METHOD OF AND SYSTEM FOR MANAGING MULTI-DIMENSIONAL DATABASES 
USING MODULAR-ARITHMETIC BASED ADDRESS DATA MAPPING PROCESSES 



BACKGROUND OF THE INVENTION 



The present invention relates to an improved method of and a system for managing data 
elements in a multi-dimensional database (MDB) supported upon a parallel computing platform 
using improved address data mapping (i.e. translation) processes, and more particularly, to an 
improved method of and a system for managing data elements within a MDB during on-line 
analytical processing (OLAP) operations. 

Background Art 

The ability to act quickly and decisively in our increasingly competitive marketplace is 
critical to the success of an organization. The volume of information that is available to 
corporations is rapidly increasing and frequently overwhelming. Those that effectively and 
efficiently manage such tremendous volumes of data, and use the information to make business 
decisions, will realize a significant competitive advantage in the marketplace. 

The creation of an enterprise-wide data store, known as data warehousing, is the first 
step towards managing these volumes of data. The Data Warehouse is becoming an integral part 
of many information delivery systems because it provides a single, central location where a 
reconciled version of data extracted from a wide variety of operational systems is stored. Over 
the last few years, improvements in price, performance, scalability, and robustness of open 
computing systems have made data warehousing a central component of Information 
Technology strategies. Details on methods of data integration and data warehouse construction 
can be found in the white paper entitled Data Integration: The Warehouse Foundation by 
Louis Rolleigh and Joe Thomas, published at http://www.acxiom.com/whitepapers/wp-11.asp. 

Building a Data Warehouse has its own special challenges (e.g. using a common data 
model, common business dictionary, etc.) and is a complex endeavor. However, just having a 
Data Warehouse does not provide organizations with the often-heralded business benefits of 
data warehousing. To complete the supply chain from transactional systems to decision makers, 
organizations need to deliver systems that allow knowledge workers to make strategic and 
tactical decisions based on the information stored in these warehouses. These decision support 
systems are referred to as On-Line Analytical Processing (OLAP) systems. OLAP systems 
allow knowledge workers to intuitively, quickly, and flexibly manipulate operational data using 
familiar business terms, in order to provide analytical insight into a particular problem or line of 
inquiry. For example, by using an OLAP system, decision makers can slice and dice 
information along a customer (or business) dimension, and view business metrics by product 
and through time. Reports can be defined from multiple perspectives that provide a high-level 
or detailed view of the performance of any aspect of the business. Decision makers can 
navigate throughout their database by drilling down on a report to view elements at finer levels 
of detail, or by pivoting to view reports from different perspectives. To enable such full- 
functioned business analyses, OLAP systems need to (1) support sophisticated analyses, (2) 
scale to large numbers of dimensions, and (3) support analyses against large atomic data sets. 
These three key requirements are discussed further below. 

Decision makers use key performance metrics to evaluate the operations within their 
domain, and OLAP systems need to be capable of delivering these metrics in a 
user-customizable format. These metrics may be obtained from the transactional databases, 
precalculated and stored in the database, or generated on demand during the query process. 
Commonly used metrics include: 

(1) Multidimensional Ratios (e.g. Percent to Total) 
Show me the contribution to weekly sales and category profit made by all items sold in the 
Northwest stores between July 1 and July 14. 

(2) Comparisons (e.g. Actual vs. Plan, This Period vs. Last Period) 

WO 01/11497 PCT/IB00/01100 

Show me the sales-to-plan percentage variation for this year and compare it to that of the 
previous year to identify planning discrepancies. 

(3) Ranking and Statistical Profiles (e.g. Top N/Bottom N, 70/30, Quartiles) 
Show me sales, profit and average call volume per day for my 20 most profitable salespeople, 
who are in the top 30% of the worldwide sales. 

(4) Custom Consolidations (e.g. Financial Consolidations, Market Segments, Ad Hoc 
Groups) 
Show me an abbreviated income statement by quarter for the last two quarters for my 
Western Region operations. 

Knowledge workers analyze data from a number of different business perspectives or 
dimensions. As used hereinafter, a dimension is any element or hierarchical combination of 
elements in a data model that can be displayed orthogonally with respect to other combinations 
of elements in the data model. For example, if a report lists sales by week, promotion, store, 
and department, then the report would be a slice of data taken from a four-dimensional data 
model. 

Target marketing and market segmentation applications involve extracting highly 
qualified result sets from large volumes of data. For example, a direct marketing organization 
might want to generate a targeted mailing list based on dozens of characteristics, including 
purchase frequency, purchase recency, size of the last purchase, past buying trends, customer 
location, age of customer, and gender of customer. These applications rapidly increase the 
dimensionality requirements for analysis. 

The number of dimensions in OLAP systems ranges from a few orthogonal dimensions to 
hundreds of orthogonal dimensions. Orthogonal dimensions in an exemplary OLAP application 
might include Geography, Time, and Products. 

Atomic data refers to the lowest level of data granularity required for effective decision 
making. In the case of a retail merchandising manager, "atomic data" may refer to information 
by store, by day, and by item. For a banker, atomic data may be information by account by 
transaction by branch. Most organizations implementing OLAP systems find themselves 
needing systems that can scale to tens, hundreds, and even thousands of gigabytes of atomic 
information. 

As OLAP systems become more pervasive and are used by the majority of the 
enterprise, more data over longer time frames will be included in the data store (i.e. data 
warehouse), and the size of the database will increase by at least an order of magnitude. Thus, 
OLAP systems need to be able to scale from present to near-future volumes of data. 

In general, OLAP systems need to (1) support the complex analysis requirements of 
decision-makers, (2) analyze the data from a number of different perspectives (i.e. business 
dimensions), and (3) support complex analyses against large input (atomic-level) data sets from 
a Data Warehouse maintained by the organization using a relational database management 
system (RDBMS). 

Vendors of OLAP systems classify OLAP systems as either Relational OLAP 
(ROLAP) or Multidimensional OLAP (MOLAP) based on the underlying architecture thereof. 
Thus, there are two basic architectures for On-Line Analytical Processing systems: the 
ROLAP architecture and the MOLAP architecture. 

Overview of The Relational OLAP Architecture 

The Relational OLAP (ROLAP) system accesses data stored in a Data Warehouse to 
provide OLAP analyses. The premise of ROLAP is that OLAP capabilities are best provided 
directly against the relational database, i.e. the Data Warehouse. An overview of the ROLAP 
architecture is provided in Fig. 1A. 

The ROLAP architecture was invented to enable direct access of data from Data 
Warehouses, and therefore supports optimization techniques to meet batch window 
requirements and provide fast response times. Typically, these optimization techniques 
include application-level table partitioning, pre-aggregate inferencing, denormalization 
support, and the joining of multiple fact tables. 

As shown in Fig. 1A, a typical prior art ROLAP system has a three-tier or layer 
client/server architecture. The "database layer" utilizes relational databases for data storage, 
access, and retrieval processes. The "application logic layer" is the ROLAP engine which 
executes the multidimensional reports from multiple users. The ROLAP engine integrates with 
a variety of "presentation layers," through which users perform OLAP analyses. 

As shown in Fig. 1A, after the data model for the data warehouse is defined, data from 
on-line transaction-processing (OLTP) systems is loaded into the relational database 
management system (RDBMS). If required by the data model, database routines are run to 
pre-aggregate the data within the RDBMS. Indices are then created to optimize query access times. 
End users submit multidimensional analyses to the ROLAP engine, which then dynamically 
transforms the requests into SQL execution plans. The SQL execution plans are submitted to the 
relational database for processing, the relational query results are cross-tabulated, and a 
multidimensional result data set is returned to the end user. ROLAP is a fully dynamic 
architecture capable of utilizing precalculated results when they are available, or dynamically 
generating results from atomic information when necessary. 
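
The request-to-SQL-to-cross-tabulation flow described above can be sketched as follows. This is a minimal illustration only, not the method of any particular ROLAP product: the table `sales`, its columns, and the helper names `build_sql` and `cross_tabulate` are hypothetical, and an in-memory SQLite database stands in for the RDBMS.

```python
import sqlite3


def build_sql(measure, row_dim, col_dim, table="sales"):
    """Translate a two-dimensional report request into a GROUP BY query.

    Identifiers are interpolated directly, so they are assumed to be trusted.
    """
    return (
        f"SELECT {row_dim}, {col_dim}, SUM({measure}) "
        f"FROM {table} GROUP BY {row_dim}, {col_dim}"
    )


def cross_tabulate(rows):
    """Pivot (row_key, col_key, value) tuples into a nested dictionary."""
    table = {}
    for row_key, col_key, value in rows:
        table.setdefault(row_key, {})[col_key] = value
    return table


# Hypothetical atomic data loaded from an OLTP system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (city TEXT, month TEXT, product TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Albany", "Jan", "A", 10.0),
        ("Albany", "Jan", "B", 5.0),
        ("Albany", "Feb", "A", 7.0),
        ("Boston", "Jan", "A", 3.0),
    ],
)

# Request: revenue by city (rows) and month (columns).
sql = build_sql("revenue", "city", "month")
report = cross_tabulate(conn.execute(sql).fetchall())
```

Here the aggregation is generated dynamically from atomic rows; a ROLAP engine that finds a matching precalculated table could direct the same query at that table instead.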

Overview of MOLAP Architecture 

Multidimensional OLAP (MOLAP) systems utilize a proprietary multidimensional 
database (MDB) to provide OLAP analyses. The main premise of this architecture is that data 
must be stored multidimensionally to be accessed and viewed multidimensionally. 

As shown in Fig. 1B, a typical prior art MOLAP system has a two-tier or layer 
client/server architecture. In this architecture, the MDB serves as both the database layer and 
the application logic layer. In the database layer, the MDB system is responsible for all data 
storage, access, and retrieval processes. In the application logic layer, the MDB is responsible 
for the execution of all OLAP requests. The presentation layer integrates with the application 
logic layer and provides an interface, through which the end users view and request OLAP 
analyses on their client machines, which may be web-enabled through the infrastructure of the 
Internet. The client/server architecture of a MOLAP system allows multiple users to access the 
same multidimensional database (MDB). 

As shown in Fig. 2A, information (i.e. basic data) from a variety of operational systems 
within an enterprise, comprising the Data Warehouse, is loaded into a prior art multidimensional 
database (MDB) through a series of batch routines. The Express™ server by the Oracle 
Corporation is exemplary of a popular server which can be used to carry out the data loading 
process in prior art MOLAP systems. As shown in Fig. 2B, an exemplary 3-D MDB is schematically 
depicted, showing geography, time and products as the "dimensions" of the database. The 
multidimensional data of the MDB is organized in an array structure, as shown in Fig. 2C. 
Physically, the Express™ server stores data in pages (or records) of an information file. Pages 
contain 512, 2048, or 4096 bytes of data, depending on the platform and release of the 
Express™ server. In order to look up the physical record address from the database file 
recorded on a disk or other mass storage device, the Express™ server generates a data structure 
referred to as a Page Allocation Table (PAT). As shown in Fig. 2D, the PAT tells the 
Express™ server the physical record number that contains the page of data. Typically, the 
PAT is organized in pages. The simplest way to access a data element in the MDB is by 
calculating the "offset" using the additions and multiplications expressed by a simple formula: 

Offset = Months + Product * (# of_Months) + City * (# of_Months * # of_Products) 
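
As a hedged illustration of the offset formula above, the sketch below assumes that Months, Product and City are zero-based integer codes and that the two extents are the number of months and the number of products. The function names `offset` and `coordinates` are hypothetical and are not part of any Express™ server interface.

```python
def offset(month, product, city, n_months, n_products):
    """Linearize a (month, product, city) cell address into a single offset."""
    return month + product * n_months + city * (n_months * n_products)


def coordinates(off, n_months, n_products):
    """Invert offset(): recover (month, product, city) from a linear offset."""
    city, rest = divmod(off, n_months * n_products)
    product, month = divmod(rest, n_months)
    return month, product, city


# Example: 12 months, 100 products; cell at month 3, product 7, city 2.
example = offset(3, 7, 2, n_months=12, n_products=100)  # 3 + 84 + 2400 = 2487
```

Months vary fastest in this layout, so all months of one product in one city occupy consecutive offsets.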

During an OLAP session, the response time of a multidimensional query on a prior art 
MDB depends on how many cells in the MDB have to be added "on the fly". As the number 
of dimensions in the MDB increases linearly, the number of cells in the MDB increases 
exponentially. However, it is known that the majority of multidimensional queries deal with 
summarized high level data. Thus, as shown in Figs. 3A and 3B, once the atomic data (i.e. 
basic data) has been loaded into the MDB, the general approach is to perform a series of 
calculations in a batch manner in order to aggregate (i.e. pre-aggregate) the data elements along 
the orthogonal dimensions of the MDB and fill the array structures thereof. For example, 
revenue figures for all retail stores in a particular state (e.g. New York) would be added together 
to fill the "state" level cells in the MDB. After the array structure in the database has been 
filled, integer-based indices are created and hashing algorithms are used to improve query access 
times. Pre-aggregation of dimension D0 is always performed along the cross-section of the 
MDB along the D0 dimension. 
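
The batch pre-aggregation step can be pictured with a small roll-up along the geography dimension. This is a sketch under assumptions: the city-to-state table, the sample revenue figures, and the helper `roll_up` are hypothetical, and a dictionary keyed by (member, month) stands in for the MDB array structure.

```python
from collections import defaultdict

# Hypothetical one-level hierarchy along the geography dimension.
CITY_TO_STATE = {
    "Albany": "New York",
    "Buffalo": "New York",
    "Newark": "New Jersey",
}


def roll_up(cells, parent_of):
    """Sum (member, month) cells into the cells of each member's parent."""
    totals = defaultdict(float)
    for (member, month), value in cells.items():
        totals[(parent_of[member], month)] += value
    return dict(totals)


# Atomic (city-level) revenue cells.
city_cells = {
    ("Albany", "Jan"): 10.0,
    ("Buffalo", "Jan"): 4.0,
    ("Newark", "Jan"): 6.0,
    ("Albany", "Feb"): 1.0,
}

# "State" level cells filled by pre-aggregation.
state_cells = roll_up(city_cells, CITY_TO_STATE)
```

Applying the same function again with a state-to-country table would fill the next level of the hierarchy from the state cells rather than from the atomic data.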

As shown in Fig. 3C3, the primarily loaded data in the MDB is organized at its lowest 
dimensional hierarchy. As shown in Figs. 3C1 and 3C3, the results of the pre-aggregations are 
stored in the neighboring parts of the MDB. As shown in Fig. 3C2, along the TIME dimension, 
weeks are the aggregation results of days, months are the aggregation results of weeks, and 
quarters are the aggregation results of months. While not shown in the figures, along the 
GEOGRAPHY dimension, states are the aggregation results of cities, countries are the 
aggregation results of states, and continents are the aggregation results of countries. By 
pre-aggregating (i.e. consolidating or compiling) all logical subtotals and totals along all dimensions 
of the MDB, it is possible to carry out real-time MOLAP operations using a multidimensional 
database (MDB) containing both basic (i.e. atomic) and pre-aggregated data. 

Once this compilation process has been completed, the MDB is ready for use. Users 
request OLAP reports by submitting queries through the OLAP Application interface (e.g. 
using web-enabled client machines), and the application logic layer responds to the submitted 
queries by retrieving the stored data from the MDB for display on the client machine. Each data 
retrieval operation carried out on the MDB involves searching through the Page Allocation 
Tables (e.g. search trees) maintained therefor in order to determine the addresses of the data 
elements needed to answer the query. Because the Page Allocation Tables (PATs) typically 
contain billions of entries, paging of the tables from mass storage memory is often required, as 
schematically depicted in Fig. 4. This increases the time required to search the Page Allocation 
Tables, find the n-dimensional Cartesian addresses for the sought after data elements, convert 
the n-dimensional Cartesian addresses into physical record addresses, and physically access the 
corresponding data records stored within the storage volumes of the MDB. 

Thus, each time the basic or atomic data in the MDB requires updating in any significant 
manner, for any reason, the MOLAP system must carry out computationally intensive data 
compilation operations in order to precompile (i.e. pre-aggregate) data within the MDB. The 
graphs plotted in Fig. 5 clearly indicate the computational demands that are created when 
searching an MDB during an OLAP session, where queries are presented to the 
MOLAP system, and answers thereto are solicited, often under real-time constraints. However, 
prior art MOLAP systems have limited capabilities to dynamically create data aggregations or 
to calculate business metrics that have not been precalculated and stored in the MDB. 

Thus, there is a great need in the art for an improved way of and means for accessing data 
elements within a multi-dimensional database (MDB) containing precompiled or pre-aggregated 
data and supported on a parallel computing platform during OLAP or like operations, while 
avoiding the shortcomings and drawbacks of prior art systems and methodologies. 

In view of the computational demands of such prior art MOLAP systems, Applicants 
have recognized that the performance of such systems might be significantly improved, and 
thus made more competitive with and superior to prior art ROLAP systems, if parallel 
processing techniques are used to implement prior art MOLAP processes. 

In Fig. 6, Applicants disclose, as generally disclosed in U.S. Patent No. 5,850,547 
assigned to Oracle Corporation, incorporated herein by reference, a parallel computing machine 
(i.e. platform) 1 for implementing MOLAP systems. As shown therein, the multidimensional 
database (MDB) 2 is supported on the parallel machine using a plurality of processors 3 
denoted P0, P1, ..., Pp-1, each having DRAM 4 for address data storage during system operation, 
and one or more storage volumes 5 for storing application data and address data. An OLAP 
server 6 (e.g. the Express™ Server from the Oracle Corporation) is provided between the Data 
Warehouse (e.g. RDBMS) 7 and the parallel machine 2. The processor(s) 8 within the OLAP 
server 6, denoted by P(s), and DRAM 9 and local storage volumes 10 associated therewith, are 
in communication with the array of processors 3 in the parallel computing machine 2. Also, as 
shown, each processor 3 in the parallel computing machine 2 has direct access to the mass 
storage volumes within the Data Warehouse 7. For illustration purposes, the processor(s) used 
in the Data Warehouse 7 are indicated by reference numeral 11, whereas its DRAM is indicated 
by reference numeral 12, and its mass storage volumes are indicated by reference numeral 13. 

In principle, the use of parallel processing machines as taught in Fig. 6 should enable 
quick and direct access to an array of answers to the submitted queries, as well as speed up the 
pre-aggregation process and the execution of multidimensional queries and drill-down processes. 
However, effective parallel processing can be expected only by ensuring that the data is evenly 
distributed among the processors in the parallel computing system, and that all loads are 
balanced. 

In an effort to apply parallel processing techniques to prior art MOLAP systems, a 
number of methods of data element address assignment (i.e. address data translation) have been 
developed, each based on partitioning the array of multidimensional data. The first method 
seeks to partition a conventional array of data by dividing it by the lowest dimension of the 
corresponding MDB, as schematically illustrated in Fig. 7A. The second method seeks to 
partition a multidimensional data set by dividing it by the highest dimension of the corresponding 
MDB, as schematically illustrated in Fig. 7B. 

As indicated in Fig. 7C, the first method of data element address assignment attempts to 
carry out data address assignment using a method of partitioning a multidimensional data set by 
dividing it by the lowest dimension of the corresponding MDB. As illustrated in Fig. 7A, this 
method results in unbalanced data processing among the processors of the parallel computing 
machine, and in sequential, as opposed to parallel, access to data. 

As indicated in Fig. 7C, the second method of data element address assignment attempts to 
carry out data address assignment using a method of partitioning a multidimensional data set by 
dividing it according to the highest dimension of the corresponding MDB. As illustrated in Fig. 7B, 
this method results in unbalanced data processing among the processors of the parallel 
computing machine, and in sequential access to data. 
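
One way such imbalance can arise is sketched below. The sketch assumes a block partitioning in which the owning processor is determined by a single coordinate. The cube shape, the processor count, and the helper names are hypothetical, and the figures referenced above may depict the partitioning differently.

```python
from collections import Counter


def owner_by_dimension(coords, dim, extent, p):
    """Block partitioning: the owner depends on one coordinate only."""
    block = -(-extent // p)  # ceil(extent / p) cells of that dimension per processor
    return coords[dim] // block


def processors_touched(cells, dim, extent, p):
    """Count how many of the queried cells fall on each processor."""
    return Counter(owner_by_dimension(c, dim, extent, p) for c in cells)


# Hypothetical 8 x 8 x 8 cube spread over 4 processors.
shape = (8, 8, 8)
p = 4

# Partition by the highest dimension (index 2); query a slice fixing it to 5.
slice_high = [(a, b, 5) for a in range(8) for b in range(8)]
touched_high = processors_touched(slice_high, 2, shape[2], p)

# Partition by the lowest dimension (index 0); query a slice fixing it to 5.
slice_low = [(5, b, c) for b in range(8) for c in range(8)]
touched_low = processors_touched(slice_low, 0, shape[0], p)
```

In both cases all 64 cells of the queried slice reside on a single processor, so that processor works through the slice sequentially while the other three remain idle.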

Surprisingly, Applicants have discovered that implementing a MOLAP system on a 
parallel computing platform, using the data structure of conventional Page Allocation Tables, 
does not provide increases in system performance (e.g. decreased access/search time) which 
might be expected when parallelizing a serial computing application. 

Accordingly, there is a great need in the art for an improved parallel-based method and 
system for accessing data elements in a MDB without the shortcomings and drawbacks 
associated with prior art techniques such as, for example, in U.S. Patent 5,850,547. 

DISCLOSURE OF THE INVENTION 

Accordingly, it is a primary object of the present invention to provide an improved 
method of and apparatus for accessing data elements within a multidimensional database 
(MDB) using a parallel computing platform, achieving a significant increase in system 
performance (e.g. decreased access/search time) using parallel computing techniques. 

Another object of the present invention is to provide such apparatus in the form of an 
improved MOLAP system, wherein the MDB contains precompiled or pre-aggregated data and 
parallel data loading operations are carried out between the Data Warehouse and the MDB of 
the system using a novel modular arithmetic based data element address assignment scheme 
which involves mapping (i) integer-encoded MDB dimensions associated with the raw data 
elements accessed from the Data Warehouse, into (ii) integer-encoded data storage addresses 
within the storage volumes associated with the MDB. 

Another object of the present invention is to provide such apparatus in the form of an 
improved MOLAP system, wherein parallel data aggregation operations are carried out within 
the MDB of the system using a novel modular arithmetic based data element address 
assignment scheme which involves mapping (i) integer-encoded MDB dimensions associated 
with the raw or previously pre-aggregated data elements to be stored within the MDB, into (ii) 
integer-encoded data storage addresses within the storage volumes thereof at which the 
pre-aggregated data elements are to be stored. 

Another object of the present invention is to provide such apparatus in the form of an 
improved MOLAP system, wherein OLAP operations are carried out within the MDB of the 
system using a novel modular arithmetic based data element address assignment scheme which 
involves mapping (i) integer-encoded MDB dimensions associated with pre-aggregated data 
elements to be accessed from the MDB, into (ii) integer-encoded data storage addresses within 
the storage volumes thereof, from which the pre-aggregated data elements are to be accessed. 

Another object of the present invention is to provide such an improved MOLAP 
system, wherein data processing tasks are evenly distributed among processors on the parallel 
computing platform of the system. 

Another object of the present invention is to provide such an improved MOLAP 
system, wherein data elements within the MDB of the system are evenly distributed among the 
processors on the parallel computing platform thereof. 

Another object of the present invention is to provide such an improved MOLAP 
system, wherein each processor on the parallel computing platform handles data elements 
assigned thereto during data address assignment operations carried out during parallel data 
loading operations and parallel data aggregation operations within the system. 

Another object of the present invention is to provide such an improved MOLAP 
system, wherein there is no need to exchange data among processors on the parallel computing 
platform. 

Another object of the present invention is to provide such an improved MOLAP 
system, wherein the need for interprocessor communication among the parallel processors is 
minimized. 

Another object of the present invention is to provide an improved MOLAP method, 
wherein parallel data loading operations are carried out between the Data Warehouse and MDB 
of the system using a data element address assignment scheme that employs mapping of MDB 
dimensions using modular arithmetic. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein parallel data aggregation operations are carried out between the Data 
Warehouse and MDB of the system using a data element address assignment scheme that 
employs mapping of MDB dimensions using modular arithmetic. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein data processing tasks are evenly distributed among processors on the parallel 
computing platform of the system. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein data elements within the MDB of the system are evenly distributed among the 
processors on the parallel computing platform thereof. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein each processor on the parallel computing platform handles data elements 
assigned thereto during data address assignment operations carried out during parallel data 
loading operations and parallel data aggregation operations within the system. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein there is no need to exchange data among processors on the parallel computing 
platform. 

Another object of the present invention is to provide such an improved MOLAP 
method, wherein the need for interprocessor communication among the parallel processors is 
minimized. 

Another object of the present invention is to provide a new method of generating an 
information directory or index for a multidimensional database (MDB) used in a MOLAP 
system. 

Another object of the present invention is to provide such a method of generating an 
information directory or index for an MDB, wherein data element addresses to data storage 
elements therewithin are generated using (i) modular arithmetic functions, (ii) dimensions of 
the MDB and its dimensional hierarchy, and (iii) data variables from the relational database 
management system (RDBMS) of the Data Warehouse associated with the MDB. 

Another object of the present invention is to provide an improved decision support 
system which allows knowledge workers to intuitively, quickly, and flexibly manipulate 
operational data using familiar business terms in order to provide analytical insight into a 
business domain of interest. 

Another object of the present invention is to provide a novel method of using a MDB to 
support OLAP systems. 

Another object of the present invention is to provide an improved system and method 
of searching and updating a MDB containing an index of information resources locators (URLs) 
on the Internet, referred to as an MDB-based URL-Index or Directory. 

Another object of the present invention is to provide such an improved system and 
method of searching and updating a MDB-based URL-Index or Directory, wherein data storage, 
retrieval, updating and shifting operations are carried out within the MDB of the system using a 
novel modular arithmetic based data element address assignment scheme which involves 
mapping (i) integer-encoded MDB dimensions associated with data elements to be stored in, 
retrieved from or shifted within the MDB, into (ii) integer-encoded data storage addresses 
within the storage volumes thereof. 

Another object of the present invention is to provide a novel method of data mapping and 
storage for use in the parallel access of multidimensional databases, as well as in parallel data 
loading and aggregation operations, and on-the-fly multidimensional queries, while ensuring 
balanced processing and minimizing interprocessor communication among a plurality of 
processors. 

Another object of the present invention is to provide a method of decomposing, or 
partitioning, an n-dimensional database into p modules, where p represents the number of 
processors (i.e. processing modules) in the multiprocessing array, D_1, D_2, ..., D_n represent the n 
dimensions, and k represents the k-th out of p processing modules, the method being based on the 
following address data translation (i.e. mapping) formula: 

k = (D_1 + D_2 + ... + D_n) mod p 

Another object of the present invention is to provide such a method, wherein each data 
element is specified by index A, and the entire data domain is decomposed and assigned to the 
Processor (Memory) Space of p processing modules. 
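
The decomposition of the data domain over p processing modules can be illustrated with a minimal sketch. It assumes that the mapping sums the integer-encoded dimension coordinates of a data element and reduces the sum modulo p. The cube shape, the processor count, and the function name `processor_for` are hypothetical choices for illustration.

```python
from collections import Counter
from itertools import product


def processor_for(coords, p):
    """Assumed mapping: k = (sum of integer-encoded coordinates) mod p."""
    return sum(coords) % p


# Hypothetical 8 x 8 x 8 cube spread over 4 processing modules.
shape = (8, 8, 8)
p = 4

# Load per processor over the entire data domain.
all_cells = product(*(range(extent) for extent in shape))
load = Counter(processor_for(c, p) for c in all_cells)

# Load per processor for a slice that fixes the last coordinate to 5.
slice_cells = [(a, b, 5) for a in range(8) for b in range(8)]
slice_load = Counter(processor_for(c, p) for c in slice_cells)
```

In this example each processor receives 128 of the 512 cells, and each also receives 16 of the 64 cells of the slice, so a query confined to one slice is still spread across all processors. The exact balance seen here relies on the extents being multiples of p; for other extents the distribution is only approximately even.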

Another object of the present invention is to provide a novel MDB-based Internet URL 
Directory system for supporting on-line information searching operations by Web-enabled 
client machines. 

Another object of the present invention is to provide a novel personalized electronic 
commerce (i.e. on-line) shopping system, in which consumer shopping profile information is 
collected on individual consumers during e-commerce and other transactions, stored in an MDB 
for quick access and use in creating Web-enabled personalized shopping environments (e.g. 
personalized Web-stores) in a real-time manner which reflect the interests, tastes, desires and/or 
expectations of the individual customers engaged in on-line shopping activities supported by 
electronic-commerce servers over the Internet. 

Another object of the present invention is to provide a novel MDB-based system for 
providing fast, affordable and easy access to customer intelligence, enabling companies to more 
effectively market products and services over the Internet. 

Another object of the present invention is to provide a novel MDB-based system that 
enables value-added services to customers running e-commerce enabled Web sites. 

Another object of the present invention is to provide a novel MDB-based system that 
enables improved levels of strategic business analysis and data mining on the Internet. 

Another object of the present invention is to provide a novel MDB-based system that 
enables a company to leverage strategic information on its customers and competitors by 
quickly uncovering hidden patterns and more accurately predicting customer behavior. 

Another object of the present invention is to provide a novel MDB-based system that 
enables fast knowledge discovery and accurate predictive business modeling for applications 
such as database marketing, financial/risk analysis, fraud management, bioinformatics, 
return-on-investment (ROI) justification, business intelligence applications (e.g. Balanced Scorecard, 
Activity-Based Costing), customer relations management (CRM), enterprise information 
portals and the like. 

Another object of the present invention is to provide a novel Internet-enabled MDB- 
based system for supporting real-time control of processes in response to complex states of 
information reflected in the MDB. 

These and other objects of the present invention will become apparent hereinafter and in 
the Claims to Invention set forth herein. 

BRIEF DESCRIPTION OF THE DRAWINGS 

In order to more fully appreciate the objects of the present invention, the following 
Detailed Description of the Illustrative Embodiments should be read in conjunction with the 
accompanying Drawings, wherein: 

Fig. 1A is a schematic representation of an exemplary prior art relational on-line 
analytical processing (ROLAP) system comprising a three-tier or layer client/server 
architecture, wherein the first tier has a database layer utilizing relational databases (RDBMS) 
for data storage, access, and retrieval processes, the second tier has an application logic layer 
(i.e. the ROLAP engine) for executing the multidimensional reports from multiple users, and the 
third tier integrates the ROLAP engine with a variety of presentation layers, through which 
users perform OLAP analyses; 

Fig. 1B is a schematic representation of a generalized embodiment of a prior art
multidimensional on-line analytical processing (MOLAP) system comprising an on-line
transactional processing (OLTP) relational database, a Data Warehouse realized as a relational
database, an OLAP server, a plurality of OLAP clients, and an OLAP multidimensional
database;

Fig. 2A is a schematic representation of the Data Warehouse shown in the prior art
system of Fig. 1B, comprising numerous data tables (e.g. T1, T2, ... Tn) and data field links,
and the OLAP multidimensional database shown in Fig. 1B, comprising a conventional page
allocation table (PAT) with pointers pointing to the physical storage of variables in an
information storage device;

Fig. 2B is a schematic representation of an exemplary three-dimensional database
organized as a 3-dimensional Cartesian cube and used in the prior art system of Fig. 2A,
wherein the first dimension of the MDB is representative of geography (e.g. cities, states,
countries, continents), the second dimension of the MDB is representative of time (e.g. days,
weeks, months, years), the third dimension of the MDB is representative of products (e.g. all
products, by manufacturer), and the basic data element is a set of variables which are addressed
by 3-dimensional coordinate values;

Fig. 2C is a schematic representation of a prior art array structure associated with an
exemplary three-dimensional database, arranged according to a dimensional hierarchy;

Fig. 2D is a schematic representation of a prior art page allocation table for an exemplary 
three-dimensional database, arranged according to pages of data element addresses; 

Fig. 3A is a schematic representation of a prior art MOLAP system, illustrating the
process of periodically storing raw data in the Data Warehouse thereof, serially loading basic
data from the Data Warehouse to the multidimensional database (MDB), and the process of
serially pre-aggregating (or precompiling) the data in the multidimensional database along the
entire dimensional hierarchy thereof;

Fig. 3B is a schematic representation illustrating that the Cartesian addresses listed in a
prior art page allocation table (PAT) point to where physical storage of data elements (i.e.
variables) occurs in the information recording media (e.g. storage volumes) associated with the
MDB, during the loading of basic data into the MDB as well as during data preaggregation
processes carried out therewithin;

Fig. 3C1 is a schematic representation of an exemplary three-dimensional database used
in a conventional MOLAP system of the prior art, showing that each data element contained
therein is physically stored at a location in the recording media of the system which is specified
by the dimensions (and subdimensions within the dimensional hierarchy) of the data variables
which are assigned integer-based coordinates in the MDB, and also that data elements
associated with the basic data loaded into the MDB are assigned lower integer coordinates in
MDB Space than pre-aggregated data elements contained therewithin;

Fig. 3C2 is a schematic representation illustrating that a conventional hierarchy of the
dimension of time typically contains the subdimensions days, weeks, months, quarters,
etc. of the prior art;

Fig. 3C3 is a schematic representation showing how data elements having higher
subdimensions of time in the MDB of the prior art are typically assigned increased integer
addresses along the time dimension thereof;

Fig. 4 is a schematic representation illustrating that, for very large prior art
multidimensional databases, very large page allocation tables (PATs) are required to represent
the address locations of the data elements contained therein, and thus there is a need to employ
address data paging techniques between the DRAM (e.g. program memory) and mass storage
devices (e.g. recording discs or RAIDs) available on the serial computing platform used to
implement such prior art MOLAP systems;

Fig. 5 is a graphical representation showing how search time in a conventional (i.e. prior
art) multidimensional database increases in proportion to the amount of preaggregation of data
therewithin;

Fig. 6 is a schematic representation of a generalized MOLAP system, wherein a parallel 
computing machine is used to realize the MDB thereof using any one of several prior art data
element addressing methods; 

Fig. 7A is a schematic representation illustrating a first prior art method of data element
address assignment which involves partitioning a conventional 4-D array of data by splitting
the multidimensional data according to the lowest dimension of the MDB, wherein this method
can be used during both data element loading and preaggregation processes subject to the
shortcomings and drawbacks set forth in Fig. 7C;

Fig. 7B is a schematic representation illustrating a second prior art method of data
element address assignment which involves partitioning a conventional 4-D array of data by
splitting the multidimensional data according to the highest dimension of the MDB, wherein this
method can be used during both data element loading and preaggregation processes subject to
the shortcomings and drawbacks set forth in Fig. 7C;

Fig. 7C is a table setting forth the shortcomings and drawbacks associated with the prior
art data element address assignment methods depicted in Figs. 7A and 7B;

Fig. 8A is a schematic representation illustrating a preferred method of data element
address assignment in accordance with the present invention, implemented on the parallel
computing platform of Fig. 6, and involving the generation of a set of (memory) page allocation
tables (PATs) by mapping the dimensions of the MDB into physical storage addresses using
modular integer-based arithmetic;

Fig. 8B is a schematic representation of the method of Fig. 8A, indicating that the inputs
to the mapping (i.e. translation) function employed in the data address assignment method of
Fig. 8A are the MDB dimensions and dimensional hierarchy and variables in the RDBMS
database, and that the outputs from the mapping function are a set of page allocation tables
(PATs) preassigned to the plurality of processors associated with the parallel computing
machine (i.e. platform) shown in Fig. 6;

Fig. 8C is a schematic representation of a MOLAP system in accordance with the
present invention, illustrating the process of periodically storing raw data in the Data
Warehouse thereof, loading basic data in parallel from the Data Warehouse to the MDB, and the
parallel process of pre-aggregating (or precompiling) the data in the MDB along the entire
dimensional hierarchy thereof;

Fig. 9A1 illustrates the result of the process of data element address assignment
employed during the process of data element loading, between the MDB space of a 4-D MDB
and the processor (memory) space of 4 processors (p=4) on a parallel machine operating in
accordance with the present invention, showing uniform distribution of data elements of the
MDB among processors;

Fig. 9A2 illustrates the result of the process of data element address assignment
employed during the process of data element loading, between the MDB space of a 4-D MDB
and the processor (memory) space of 3 processors (p=3) on a parallel machine operating in
accordance with the present invention, showing uniform distribution of data elements of the
MDB among processors;

Fig. 9A3 illustrates the result of the process of data element address assignment during
the process of data element loading, between the MDB space of a 4-D MDB and the processor
(memory) space of 5 processors (p=5) on a parallel machine operating in accordance with the
present invention, showing uniform distribution of data elements of the MDB among
processors;

Fig. 9A4 illustrates the result of the process of data element address assignment
employed during the process of data element loading, between the MDB space of a 4-D MDB
and the processor (memory) space of 6 processors (p=6) on a parallel machine operating in
accordance with the present invention, showing uniform distribution of data elements of the
MDB among processors;

Fig. 10A is a schematic representation of the MOLAP system of the present invention
shown in Fig. 6, illustrating the parallel loading of basic data from the Data Warehouse to the
MDB supported on the parallel computing machine, using a plurality of software drivers
provided for in the OLAP server of the MOLAP system;

Fig. 10B is a schematic representation of the MOLAP system of the present invention
shown in Fig. 10A, illustrating the parallel loading of basic data from the Data Warehouse to the
MDB supported on the parallel computing machine, using the data element address assignment
method shown in Fig. 8A;

Fig. 11A is a schematic representation illustrating the internal addressing of data
elements in the Storage Space of processor P_0, in the particular case of Fig. 9A1, in accordance
with the present invention;

Fig. 11B is a schematic representation of an array of data elements in the Storage Space
of processor P_0 (of Fig. 11A), arranged according to the present invention;

Fig. 11C is a schematic representation of a Page Allocation Table of processor P_0, for
the exemplary array of Fig. 11B, arranged according to pages of data element addresses;

Fig. 12A is a schematic representation depicting the parallel data aggregation process of
the present invention, shown carried out on a parallel machine of the type shown in Fig. 6
having 4 processors (p=4) operating in accordance with the present invention, and showing that
partial aggregation results from the processors are concatenated into a final result by
interprocessor communication provided for within the parallel computing machine;

Fig. 12B1 is a schematic representation depicting the parallel-based method of
pre-aggregation according to the present invention, illustrating that each processor on a
four-processor parallel computing machine, of the type shown in Fig. 6, is assigned about the same
number of data elements during data aggregation along the D_0=2 subspace of a 3-D MDB
supported thereupon;

Fig. 12B2 is a schematic representation depicting the parallel-based method of
pre-aggregation according to the present invention, illustrating that each processor on a
four-processor parallel computing machine, of the type shown in Fig. 6, is assigned about the same
number of data elements during data aggregation along the D_3=1 subspace of a 3-D MDB
supported thereupon;

Fig. 12B3 is a schematic representation depicting the parallel-based method of
preaggregation according to the present invention, illustrating that each processor on a
four-processor parallel computing machine, of the type shown in Fig. 6, is assigned about the same
number of data elements during data aggregation along the D_1=1 subspace of a 3-D MDB
supported thereupon;

Fig. 12B4 is a schematic representation depicting the parallel-based method of
preaggregation according to the present invention, illustrating that each processor on a
four-processor parallel computing machine, of the type shown in Fig. 6, is assigned about the same
number of data elements during data aggregation along the D_2=1 subspace of a 3-D MDB
supported thereupon;

Fig. 12C1 is a schematic representation depicting the aggregation procedure of the present
invention carried out within a 3-dimensional MDB, where every single data element of the base
data is summed up to the pre-aggregated data along each of the dimensions, and is handled only
once during the entire data aggregation process of the present invention;

Fig. 12C2 is a schematic representation of the aggregation procedure of the present
invention carried out within a 5-dimensional MDB, where every single data element of the base
data is summed up to the pre-aggregated data along each of the dimensions, and is handled only
once in the entire aggregation process;

Fig. 12C3 is a schematic representation of the "Storage Space" of a single processor in
the parallel operating machine of the present invention, illustrating, during the aggregation
process, that most of the data is in a compressed state, in order to save memory/disk space and
handling times, that all disk data is in a compressed state, that the data in the main memory is
kept in two levels, namely compressed and open, that the aggregation program works directly
on the open level, and that the open data, according to space availability, is compressed and
moved to the Disk Space;

Fig. 13A is a schematic representation of the parallel-based method of pre-aggregation
according to the present invention, illustrating the first aggregation level L=0 along dimension
Di, where all p processors on the parallel computing platform are participating in the data
aggregation process;

Fig. 13B is a schematic representation of the parallel-based method of preaggregation
according to the present invention, illustrating the Lm-th aggregation level along dimension Di,
where all p processors on the parallel computing platform are participating in the data
aggregation process;

Figs. 14A through 14B3 set forth a series of schematic representations illustrating that 
data element loading and aggregation processes of the present invention can be carried out with 
an MDB having any dimensionality or hierarchy of dimensionality, provided that each unit of 
dimensionality in the MDB is indexed using integer-based arithmetic;

Fig. 15A is a schematic representation of an Internet URL directory system according to
the present invention, wherein a parallel computing machine is used to realize the MDB-based
URL Directory (or Index) thereof using any one of several possible types of data element
addressing methods in accordance with the principles of the present invention, whereas a
relational database management system (RDBMS) is used to realize the Internet URL
registration subsystem thereof;

Fig. 15B is a schematic representation of an exemplary three-dimensional database of an
Internet URL Directory organized as a 3-dimensional Cartesian cube according to the present
invention, wherein the first dimension of the MDB is representative of Health, the second
dimension of the MDB is representative of Arts and Humanities, the third dimension of the
MDB is representative of Education, and the basic data element is a set of WWW (e.g. HTML
or XML) Pages which are addressed by 3-dimensional coordinate values;

Fig. 16 is a schematic representation of the parallel computing machine of Fig. 15A,
illustrating the parallel loading of data from the RDBMS-based URL registration Data
Warehouse, to the MDB-based URL Directory supported on the parallel computing machine,
using the data element address assignment method (i.e. address data mapping method) shown in
Fig. 8A;

Fig. 17 is a schematic representation of a personalized on-line e-commerce shopping
system according to the present invention, wherein a parallel computing machine, of the type
shown in Fig. 6, is used to realize the consumer shopping profile MDB thereof, whereas
a RDBMS is used to realize the consumer shopping profile Data Warehouse thereof; and

Fig. 18 is a schematic representation of the parallel computing machine of Fig. 17,
illustrating the parallel loading of data from the consumer shopping profile Data Warehouse, to
the consumer shopping profile MDB supported on the parallel computing machine, using the
data element address assignment method (i.e. address data mapping method) shown in Fig. 8A.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE PRESENT INVENTION

Referring now to Figs. 6, and 8A through 14B3, the preferred embodiments of the
method and system of the present invention will now be described in great detail hereinbelow,
wherein like elements in the Drawings shall be indicated by like reference numerals.

In general, the address data mapping method and apparatus of the present invention can
be employed in a wide range of applications, including MOLAP systems, Internet URL-directory
systems, personalized on-line e-commerce shopping systems, Internet-based systems
requiring real-time control of packet routing and/or switching, and the like. For purposes of
illustration, initial focus will be accorded to improvements in MOLAP systems, in which
knowledge workers are enabled to intuitively, quickly, and flexibly manipulate operational data
within a MDB using familiar business terms, in order to provide analytical insight into a
business domain of interest. Thereafter, an improved system and method of accessing
information on the WWW using an Internet-based URL directory will be described. Then, a
personalized e-commerce shopping system will be described. Other applications will also be
discussed.

Method of Assigning Data Elements In The MDB To Specific Processors On A Parallel
Computing Platform

The MOLAP system and method of the preferred embodiment can be realized using the
parallel computing machine shown in Fig. 6 and described above, in combination with the
teachings of the present invention disclosed herein. As best illustrated in Fig. 8A, the data
elements contained within the MDB of the MOLAP system are located within MDB Space at
address locations specified by the integer-encoded dimensions of the data elements in the
MDB. As schematically depicted in Fig. 8A, these data elements within the MDB are
physically stored in the storage volumes of the MDB (in Processor Storage Space) at storage
locations specified by integer-encoded addresses that are generated using the novel address data
translation/mapping process of the present invention. To carry out each of the many operations
associated with managing the MDB system of the present invention, the novel method of data
element address assignment, schematically illustrated in Fig. 8A, is used within the MOLAP
system.

As illustrated in Figs. 8A and 8B, the method of data element address assignment
involves generating a data structure arranged as a set of page allocation tables or PATs. Each
PAT, generated for and assigned to a particular processor P_k on the parallel computing platform
shown in Fig. 6, contains information relating (i) the n-dimensional integer-based Cartesian
addresses of data elements in the MDB assigned to the specific processor P_k, to (ii) the
integer-based physical address locations where the corresponding data elements (i.e. data records)
are stored in the storage volumes associated with the specific processor P_k. In other words, each
PAT assigned to a specific processor P_k provides a one-to-one mapping between (i) each
integer-based Cartesian address location in the MDB assigned to the processor P_k, and (ii) a
unique integer-based physical-storage address location in the storage volume of the specific
processor P_k.

In general, the physical storage address of each data element in the MDB (listed in its
corresponding PAT) is generated by a two-step process comprising: first assigning each data
element in the MDB (or more precisely, each integer-based logical/Cartesian address location in
the MDB) to a specific processor P_k; and then generating a unique integer-based data storage
address location within the physical storage volume of the specified processor P_k.



Fig. 8A schematically depicts the process of assigning each data element in a
three-dimensional MDB (or more precisely, each integer-based logical/Cartesian address location in
the 3-D MDB) to a specific processor P_k on the parallel computing platform. As illustrated in
Fig. 8A, this process of processor assignment uses the following integer-based modular
arithmetic function for the case of a 3-D MDB:

$$k = (D_2 + D_1 + D_0) \bmod p$$

wherein:

k is the processor identity index (i.e. processor number) associated with the k-th
processor's data Storage Space;

D_2, D_1 and D_0 are the first, second, and third (business) dimensions of the MDB, respectively;
and

p is the number of processors employed on the parallel computing platform of the
system.

For the general case of an n-dimensional MDB, processor assignment is carried out using
the following integer-based modular arithmetic function:

$$k = (D_n + \cdots + D_2 + D_1 + D_0) \bmod p \qquad (1)$$
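As a minimal sketch of formula (1), the following Python function computes the processor index for one integer-encoded MDB address. The function name `assign_processor` and the sample coordinates are illustrative assumptions and do not come from the specification.

```python
from typing import Sequence


def assign_processor(coords: Sequence[int], p: int) -> int:
    """Return the processor index k for one MDB address, following formula (1)."""
    if p <= 0:
        raise ValueError("p must be a positive number of processors")
    # k is the sum of all dimension indices, reduced modulo the processor count.
    return sum(coords) % p


# Hypothetical 3-D address (D_0, D_1, D_2) = (5, 2, 1) on a four-processor platform.
k = assign_processor((5, 2, 1), p=4)
print(k)
```

Because only the sum of the coordinates matters, the same function serves a cube of any dimensionality.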

As shown in Fig. 8B, the inputs to the mapping function employed in the data address
assignment method of Fig. 8A are: (i) modular arithmetic function(s); (ii) the dimensions of the
MDB and its dimensional hierarchy; and (iii) data variables from the relational database
management system (RDBMS) of the Data Warehouse associated with the MDB. As
illustrated in Fig. 8B, the outputs from the mapping function are a set of page allocation tables
(PATs) generated for the plurality of processors associated with the parallel computing
machine of Fig. 6.
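The mapping from MDB dimensions to a set of per-processor PATs can be sketched as follows. This is a simplified illustration under stated assumptions: each PAT is modeled as a Python dictionary, and the physical address is simply the next free slot on the owning processor, a stand-in for the local address generation formula of the specification. The name `build_pats` and the dimension sizes are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Address = Tuple[int, ...]


def build_pats(dim_sizes: Sequence[int], p: int) -> List[Dict[Address, int]]:
    """Build one page allocation table (PAT) per processor.

    Each PAT maps a Cartesian MDB address to a storage slot local to that
    processor. Sequential slot numbering is an assumption of this sketch.
    """
    pats: List[Dict[Address, int]] = [{} for _ in range(p)]
    for addr in product(*(range(size) for size in dim_sizes)):
        k = sum(addr) % p              # processor assignment, formula (1)
        pats[k][addr] = len(pats[k])   # next free slot on processor k
    return pats


# Hypothetical 4 x 3 x 2 cube distributed over four processors.
pats = build_pats(dim_sizes=(4, 3, 2), p=4)
for k, pat in enumerate(pats):
    print(k, len(pat))
```

Every address lands in exactly one PAT, and within a PAT no two addresses share a slot, which is the one-to-one property the text requires.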

Notably, in the PAT generation process of Fig. 8B, each processor P_k generates a unique
integer-based physical storage address for storing each assigned data element within the
physical storage volume of the specified processor P_k. In the case of a four-processor (p=4)
computing platform, each processor P_k generates a unique integer-based physical storage
address for the assigned data element using the following local address generation formula:

$$\mathrm{Loc}(D_0, D_1, D_2, D_3) = \left\lfloor \frac{D_0 + \mathrm{size}D_0 \cdot \bigl(D_1 + \mathrm{size}D_1 \cdot (D_2 + \mathrm{size}D_2 \cdot D_3)\bigr)}{p} \right\rfloor$$

wherein: p represents the number of processors; D_i represents the running index of the i-th
dimension; and sizeD_i represents the modified size of the i-th dimension, given by
sizeD_i = int((sizeD_i / p ... (wherein int denotes truncating the result to an integer value).

In the general case of an n-dimensional MDB, each processor P_k generates a unique
integer-based physical storage address for each data element assigned by formula (1) above using the
following local address generation formula:

$$\mathrm{Loc}(D_0, D_1, \ldots, D_n) = \left\lfloor \frac{D_0 + \mathrm{size}D_0 \cdot \bigl(D_1 + \mathrm{size}D_1 \cdot (\cdots D_n)\bigr)}{p} \right\rfloor \qquad (2)$$
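The nested form of the local address formula can be sketched in Python as below. One point is an explicit assumption of this sketch: the definition of the "modified size" is not fully legible in the text, so the code rounds the size of dimension D_0 up to the next multiple of p, a choice that keeps the pair (processor, local address) unique. The names `padded_size` and `local_address` and the cube sizes are hypothetical.

```python
from itertools import product
from typing import Sequence


def padded_size(size: int, p: int) -> int:
    """Round size up to the next multiple of p (assumed form of the modified size)."""
    return -(-size // p) * p


def local_address(coords: Sequence[int], dim_sizes: Sequence[int], p: int) -> int:
    """Nested linear index D_0 + size0*(D_1 + size1*(...)), divided by p and truncated."""
    sizes = [padded_size(dim_sizes[0], p), *dim_sizes[1:]]
    index = 0
    # Horner-style evaluation, starting from the highest dimension.
    for coord, size in zip(reversed(coords), reversed(sizes)):
        index = index * size + coord
    return index // p


# Check on a hypothetical 6 x 3 x 3 x 2 cube with p = 4 that no two elements
# assigned to the same processor receive the same local address.
SIZES, P = (6, 3, 3, 2), 4
seen = set()
for addr in product(*(range(s) for s in SIZES)):
    key = (sum(addr) % P, local_address(addr, SIZES, P))
    assert key not in seen
    seen.add(key)
print(len(seen))
```

Without the padding step, two elements on the same processor can collide when the size of D_0 is not a multiple of p, which is why the sketch makes that assumption explicit.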

Method of Loading Basic Data From Data Warehouse To MOLAP Parallel Machine

In Fig. 9A1, there is shown a four-dimensional database which is generated during the
parallel data element loading process of Fig. 10A which will be detailed hereinafter. As shown
in Fig. 9A1, the parallel data loading process is carried out between the RDBMS-based Data
Warehouse of the system and the 4-D MDB supported on a four-processor (p=4) parallel
machine, as shown in Fig. 6, operated in accordance with the principles of the present
invention. Notably, the parallel data element address mapping process depicted in Figs. 8A and
8B, and characterized by the modular arithmetic formula (1) set forth above, is employed during
the parallel data element loading process of Fig. 10A, and employs a plurality of software
drivers provided for within the OLAP server of the system.

As shown in Figs. 9A1 through 9A4, the result of the mapping process during data 
loading operations is a uniform distribution of data elements of the MDB among processors. 

For the small example shown in Fig. 9A1, the number of data elements mapped to each
one of the four processors is about the same, e.g. about 26 to 28. The largest possible variance is
smaller than p (i.e. the number of processors on the parallel computing platform). For a realistic
case, in which millions to billions of data elements are counted, such a variance is negligible.
Moreover, the method is scalable only if any number of processors can be employed without
losing the uniformity of distribution. Figs. 9A1 to 9A4 show that the system and method of
the present invention are scalable by the capacity thereof to evenly distribute data elements
among a varying number of processors on the parallel computing platform. For example, Fig.
9A2 illustrates the uniform distribution of data that results when the data set is loaded among
three (3) processors (p=3) on the parallel computing platform, wherein 36 data elements are
assigned to each processor. Fig. 9A3 illustrates that uniform distribution of data results when
the data set is loaded among five (5) processors (p=5), wherein 20 to 23 elements are distributed
to each processor. Fig. 9A4 illustrates that uniform distribution of data results when the data is
loaded among six (6) processors (p=6), wherein 18 data elements are distributed to each
processor. A comparison of Figs. 9A1 to 9A4 demonstrates that the address data mapping
process of the present invention is scalable to any dimension MDB without causing a decrease
in system performance.
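The distribution counts discussed above can be checked with a short Python sketch that applies formula (1) to every address of a small cube. The dimension sizes (6, 3, 3, 2) are a hypothetical choice giving 108 elements; the actual sizes used in Figs. 9A1 to 9A4 are not stated in this passage. The name `load_counts` is likewise illustrative.

```python
from collections import Counter
from itertools import product
from typing import List, Sequence


def load_counts(dim_sizes: Sequence[int], p: int) -> List[int]:
    """Count how many MDB addresses formula (1) assigns to each of p processors."""
    counts = Counter(
        sum(addr) % p for addr in product(*(range(s) for s in dim_sizes))
    )
    return [counts[k] for k in range(p)]


SIZES = (6, 3, 3, 2)  # hypothetical 4-D cube with 108 elements
for p in (3, 4, 5, 6):
    print(p, load_counts(SIZES, p))
```

With these assumed sizes the per-processor counts are 36 each for p=3, between 26 and 28 for p=4, between 20 and 23 for p=5, and 18 each for p=6, which agrees with the ranges quoted in the text.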

In Fig. 10A, the process of loading basic data from the RDBMS-based Data Warehouse
to the MOLAP parallel machine is shown carried out in a parallel manner in accordance with the
principles of the present invention. As shown therein, each one of the loading processes is
handled by a separate processor. This process of parallel data loading is illustrated in greater
detail in Fig. 10B. Each processor governs its own subspace, according to the mapping scheme
of Figs. 8A and 8B, characterized by formulas (1) and (2) set forth hereinabove. The relational
database access module or manager (RDBAM) shown in Fig. 10B is a software tool used to
define the mapping between relational and multi-dimensional models employed in the system of
the present invention. A commercially available RDBAM subsystem is sold by the Oracle
Corporation, under the tradename ORACLE Express Relational Access Manager (RAM),
described in detail at the uniform resource locator (URL)
http://www.oracle.com/olap/collatryramds.odf.

The function of the RDBAM is to generate the address of a basic data element based on
Warehouse Metadata contained in a Data Warehouse Metadata directory, and then access the
basic data element from within the set of relational lists comprising the RDBMS-based Data
Warehouse. Warehouse Metadata contained in the Warehouse Metadata directory consists of
information describing the Warehouse data contained in the RDBMS-based Data Warehouse,
and is stored together with the Warehouse data. The function of the Data Warehouse Metadata
directory is to describe the detailed structure of the relational data (Star, Snowflakes, etc.),
dimensions, and hierarchy relations associated with the RDBMS-based Data Warehouse. Using the
Warehouse Metadata directory, every candidate element of the MDB may be found in the set
of lists of the RDBMS-based Data Warehouse. For example, for a 4-D MDB, where a data
element is defined by four coordinates, its value will be collected from several 2-D relational
lists in the RDBMS-based Data Warehouse using the Warehouse Metadata directory.

At the beginning of the data loading process illustrated in Fig. 10B (i.e. at the
initialization phase), the relational map of the Warehouse Metadata directory and the
Multidimensional Database (MDB) map of the present invention, as defined by Fig. 8A, are
loaded into the RDBAM associated with each processor P_k. All communication between the
RDBAM and the RDBMS-based Data Warehouse is carried out by means of the SQL language. For
each processor, the Warehouse Metadata directory will be the same. However, the MDB map will
be different, in order to properly map the subspace assigned to the specific processor, according
to the modular mapping scheme of the present invention. The Data Warehouse Metadata directory
is then used to generate the address of a basic data element stored within the set of relational
lists comprising the RDBMS-based Data Warehouse. After accessing its data value, the data
element is physically stored in a mass storage device on the parallel computing machine,
according to the Page Allocation Table assigned to the processor which addressed the basic data
element. In the case of a four-processor (p=4) computing platform, the local address within the
P_k processor's storage volume is computed using the modular arithmetic formula (2) set forth
above. For the particular case of Fig. 9A1, Fig. 11A illustrates the internal mapping of data
elements in processor P_0. In the general case of an n-dimensional MDB, the local address
within the P_k processor's storage volume is computed by formula (2) set forth above.

Method Of Aggregating Data Within the MDB of the Present Invention 

As illustrated in Fig. 12A, the preaggregation process of the present invention
supported on a parallel machine, as shown in Fig. 6, involves carrying out partial aggregations at
each processor, and then concatenating the partial results into a final result by means of
interprocessor communication enabled by the parallel machine.

Figs. 12B1 to 12B4 schematically depict the uniform load balancing characteristics that
are achieved during preaggregation carried out by the method of the present invention. This
process will be detailed hereinbelow.
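The partial-aggregation step described for Fig. 12A can be sketched in Python as follows. The sketch runs the per-processor work sequentially to stay self-contained; on a parallel machine each call to `partial_aggregate` would run on its own processor. The merge step adds values that share a key, which is an assumption about what combining the partial results entails. All function names and the sample cube are hypothetical.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Iterable, List, Tuple

Address = Tuple[int, ...]


def partition(cube: Dict[Address, int], p: int) -> List[Dict[Address, int]]:
    """Split the cube into p subspaces using the modular sum of formula (1)."""
    parts: List[Dict[Address, int]] = [{} for _ in range(p)]
    for addr, value in cube.items():
        parts[sum(addr) % p][addr] = value
    return parts


def partial_aggregate(part: Dict[Address, int], axis: int) -> Dict[Address, int]:
    """Sum one processor's elements along `axis`, keyed by the remaining coordinates."""
    out: Dict[Address, int] = defaultdict(int)
    for addr, value in part.items():
        out[addr[:axis] + addr[axis + 1:]] += value
    return dict(out)


def merge(partials: Iterable[Dict[Address, int]]) -> Dict[Address, int]:
    """Combine the partial results of all processors into the final aggregate."""
    total: Dict[Address, int] = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)


# Hypothetical 4 x 3 x 2 cube holding the value 1 in every cell.
cube = {addr: 1 for addr in product(range(4), range(3), range(2))}
result = merge(partial_aggregate(part, axis=0) for part in partition(cube, p=4))
print(result[(0, 0)])
```

Each base element is read exactly once, by the processor that owns it, and only the much smaller partial results need to travel between processors.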

As shown in Fig. 12B1, when carrying out the parallel-based method of data
preaggregation according to the present invention, each processor P_k in a four-processor parallel
computing machine is assigned approximately the same number of data elements during data
aggregation along the D_0=2 multidimensional cross-section within a 3-D type MDB supported
thereupon. This is in marked contrast with the method of Fig. 7A, wherein all the
pre-aggregated data is located in a single processor's subspace, necessitating that sequential
processing be carried out.

As shown in Fig. 12B2, when carrying out the parallel-based method of preaggregation
according to the present invention, each processor on a four-processor parallel computing
machine is assigned substantially the same number of data elements during data aggregation
along the D_3=1 multidimensional cross-section within a 3-D type MDB supported thereupon.
This is in marked contrast with the method of Fig. 7B where all the pre-aggregated data is
located in two processor subspaces. Thus, the parallel-based method of the present invention
eliminates the possibility of load imbalance among processors.

As shown in Fig. 12B3, when carrying out the parallel-based method of preaggregation
according to the present invention, each processor on a four-processor parallel computing
machine is assigned substantially the same number of data elements during data aggregation
along the D1=1 multidimensional cross-section within a 3-D type MDB supported thereupon.

As shown in Fig. 12B4, when carrying out the parallel-based method of preaggregation
according to the present invention, each processor on a four-processor parallel computing
machine is assigned substantially the same number of data elements during data aggregation
along the D2=1 multidimensional cross-section within a 3-D type MDB supported thereupon.
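
The balanced assignment depicted in Figs. 12B1 to 12B4 can be illustrated with a small sketch. The mapping used below, processor = (i0 + i1 + i2) mod p, is only an assumed stand-in for the modular-arithmetic address data mapping of the invention, and the 4x4x4 cube size and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

P = 4               # number of processors (four-processor machine)
SHAPE = (4, 4, 4)   # assumed sizes of dimensions D0, D1, D2


def processor_of(cell, p=P):
    """Assumed modular-arithmetic mapping: sum of the integer indices mod p."""
    return sum(cell) % p


def load_on_cross_section(dim, value, shape=SHAPE, p=P):
    """Count the cells each processor holds on the cross-section D[dim] = value."""
    counts = Counter({k: 0 for k in range(p)})
    for cell in product(*(range(n) for n in shape)):
        if cell[dim] == value:
            counts[processor_of(cell, p)] += 1
    return dict(counts)


for dim, value in [(0, 2), (1, 1), (2, 1)]:
    print(f"D{dim}={value}:", load_on_cross_section(dim, value))
```

Under these assumptions every cross-section places four of its sixteen cells on each processor, so no single processor becomes a bottleneck for that slice.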

Referring to Figs. 12C1 to 12C3, the parallel data aggregation process of the present
invention will now be described in greater detail. In Fig. 12C1, the parallel data aggregation
process is illustrated for the case of a 3-D type MDB. Every data element of the base (or raw)
data in the MDB is summed up to produce the pre-aggregate data of the next hierarchy in each
of the dimensions of the MDB. Every data element is handled only once during the data
aggregation process. The same is done with other data elements obtained from the base data of
the Data Warehouse, wherein each data element is summed up to the next hierarchy of
dimensions.
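
A minimal single-processor sketch of this roll-up follows. The two-dimensional cells, the PARENT tables that map each base member to its parent in the next hierarchy level, and the function name are illustrative assumptions, not structures taken from the specification.

```python
from collections import defaultdict

# Hypothetical hierarchies: base member -> parent member in the next level.
PARENT = [
    {"jan": "q1", "feb": "q1", "apr": "q2"},        # dimension 0 (time)
    {"nyc": "east", "bos": "east", "sfo": "west"},  # dimension 1 (region)
]


def aggregate_next_level(base):
    """Read every base element once, adding it to its parent along each dimension."""
    rollups = [defaultdict(float) for _ in PARENT]
    for cell, value in base.items():  # each element is visited exactly once
        for dim, parents in enumerate(PARENT):
            parent_cell = list(cell)
            parent_cell[dim] = parents[cell[dim]]
            rollups[dim][tuple(parent_cell)] += value
    return [dict(r) for r in rollups]


base = {("jan", "nyc"): 10.0, ("feb", "nyc"): 5.0, ("apr", "sfo"): 7.0}
by_time, by_region = aggregate_next_level(base)
print(by_time)
print(by_region)
```

The single pass over the base data reflects the statement that every data element is handled only once; each visit contributes to one pre-aggregate per dimension.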

In Fig. 12C2, the parallel data aggregation process of the present invention is illustrated
for the case of a 5-D type MDB. As shown in Fig. 12C3, in order to save space and I/O access
time, the data is maintained mostly in a compressed state. Only the directly processed data
handled by the aggregation program is maintained in a non-compressed state. Otherwise, all
other data elements on the disk and in the main memory are maintained in a compressed state.
Moving the data between main memory and the disk associated with each processor is
accomplished in a virtual memory fashion, well known in the computing art.
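
The compressed-storage policy can be sketched as follows. Python's zlib and pickle modules stand in for whatever compression scheme the system actually uses, and the BlockStore class and its block granularity are hypothetical.

```python
import pickle
import zlib


class BlockStore:
    """Keeps every block compressed; only the block being processed is expanded."""

    def __init__(self):
        self._blocks = {}  # block id -> compressed bytes

    def put(self, block_id, values):
        self._blocks[block_id] = zlib.compress(pickle.dumps(values))

    def get(self, block_id):
        return pickle.loads(zlib.decompress(self._blocks[block_id]))

    def process(self, block_id, fn):
        """Decompress one block, apply fn to it, and store the result compressed."""
        self.put(block_id, fn(self.get(block_id)))


store = BlockStore()
store.put("b0", [0] * 1000)
store.process("b0", lambda values: [v + 1 for v in values])
print(sum(store.get("b0")))  # 1000
```

Only the block passed to process() exists in expanded form at any moment; every other block remains compressed, which is the space and I/O saving the paragraph above describes.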

Referring to Figs. 13A and 13B, the parallel-based method of data pre-aggregation
according to the present invention will now be described in greater detail. In Fig. 13A, the first
aggregation level L=0 along dimension Di is schematically depicted, where all p processors on
the parallel computing platform are shown participating in the aggregation process of the loaded
(basic) data. The partial results of the aggregation process are concatenated, and then stored
according to the address data mapping scheme of the present invention described in detail
hereinabove. Notably, not all p processors are necessarily involved in generating the resulting
data set, but rather only a subset of the p processors, indicated by the index j, specified by
processor indices P_k to P_[(k+j-1) mod p]. These processors will participate in the second level of
aggregation, where the results are stored according to the address data mapping scheme of the
present invention.

In Fig. 13B, the Lm-th aggregation level along dimension Di is schematically depicted,
where all j processors on the parallel computing platform are participating in the data
aggregation process. As in Fig. 13A, the partial results of the aggregation process are
concatenated, and then stored according to the address data mapping scheme of the present
invention described in detail hereinabove.
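
The processor subset P_k to P_[(k+j-1) mod p] and the pattern of partial aggregation followed by concatenation can be sketched as follows. The helper names, the round-robin split of the input records, and the sample values of k, j and p are illustrative assumptions.

```python
def participating_processors(k, j, p):
    """Indices P_k .. P_[(k+j-1) mod p] of the j processors used at the next level."""
    return [(k + i) % p for i in range(j)]


def partial_aggregate(chunk):
    """Partial aggregation done locally by one processor: sum values per key."""
    out = {}
    for key, value in chunk:
        out[key] = out.get(key, 0) + value
    return out


def aggregate_level(records, p):
    """Split records over p processors, aggregate locally, then combine results."""
    chunks = [records[i::p] for i in range(p)]
    partials = [partial_aggregate(chunk) for chunk in chunks]
    # Concatenation of the partial results. Merging duplicate keys here is an
    # assumption; it would be unnecessary if each key lived on one processor.
    result = {}
    for part in partials:
        for key, value in part.items():
            result[key] = result.get(key, 0) + value
    return result


print(participating_processors(k=3, j=3, p=4))  # [3, 0, 1]
records = [("q1", 1), ("q2", 2), ("q1", 3), ("q2", 4), ("q1", 5)]
print(aggregate_level(records, p=4))
```

The modulo in participating_processors lets the subset wrap around the end of the processor list, so a subset that starts at P_3 on a four-processor machine continues with P_0 and P_1.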

In Figs. 14A through 14B3, an inductive proof is provided to demonstrate that the data
element loading and aggregation processes of the present invention can be used with an MDB
having any dimensionality or hierarchy of dimensionality, provided that each unit of
dimensionality is indexed using integer-based arithmetic, in accordance with the requirements of
the address data mapping technique of the present invention described hereinabove.

Fig. 14A illustrates the case of a 3-D MDB, wherein the dimensions D0-D2 form a
cube. As shown therein, the address data mapping scheme of the present invention can be
applied to this 3-D type MDB, whereby the cube of data is evenly divided among three
processors. Fig. 14B illustrates that the address data mapping scheme of the present invention
can be applied to any n-dimensional MDB. In Fig. 14B1, the address data mapping method is
shown applied to a 3-D data cube (i.e. 3-D type MDB), whereas in Fig. 14B2 the address data
mapping method is shown applied to a 4-D data cube (i.e. 4-D type MDB), which is merely a
multiplication of 3-D data cubes. In Fig. 14B3, the method is shown applied to a 5-D data cube
(i.e. 5-D type MDB), which is simply a multiplication of 4-D data cubes, etc. Collectively, these
illustrations prove, inductively, that the address data mapping scheme of the present invention
can be applied to any n-dimensional MDB without significantly compromising the
performance of the system, and thus can be said to be highly scalable in the computational
sense.
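
The scaling argument can be checked numerically for small cases. The sketch below uses an assumed sum-of-indices-mod-p mapping (a stand-in, not necessarily the mapping of the invention) and assumes that every dimension has the same hypothetical size, chosen as a multiple of p.

```python
from collections import Counter
from itertools import product


def distribution(n_dims, size, p):
    """Cells per processor for an n_dims-dimensional cube with edge length size."""
    counts = Counter()
    for cell in product(range(size), repeat=n_dims):
        counts[sum(cell) % p] += 1  # assumed modular-arithmetic mapping
    return dict(counts)


for n in (3, 4, 5):
    print(n, distribution(n, size=3, p=3))
```

With three processors, each added dimension multiplies every processor's share by the same factor, so the division stays even as the dimensionality grows from three to five.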

Other Applications of the Present Invention

While the system and method of address data translation of the present invention has
been applied above to provide a novel way of and means for carrying out MOLAP
operations, it is understood that this method can be used in numerous other data management
operations as well.

MDB-Based URL Directory of the Present Invention

For example, the address data mapping method of the present invention can be used to
provide an improved system and method of searching and accessing an index of information
resource locators (URLs) on the Internet, referred to hereinafter as an MDB-based URL-Index
or Directory system, denoted by reference numeral 20 in Fig. 15A. In system 20, there may or
may not be any need for data aggregation as in the above-described MOLAP application, shown
in Figs. 6-14B3. In system 20, the dimensions of the MDB will be selected on the basis of the
URL/Web-site classification scheme embodied within the structure of the URL directory. For
example, referring to the Yahoo® Internet Information Resource Directory located at
http://www.yahoo.com, it is noted that as of June 28, 1999, the URL classification scheme
employed by this particular URL directory includes, at its top level scheme, the following
twelve (12) information categories: Arts and Humanities; Business & Economy; Computers
& Internet; Education; Government; Health; News & Media; Recreation & Sports; Reference;
Regional; Science; Social Science; and Society & Culture. These information categories can be
defined as the high-level dimensions of the MDB of this embodiment of the present invention.
Referring to the information category Arts and Humanities, which would be one of the
high-level dimensions of the MDB, it is noted that as of June 28, 1999, this high-level information
category has the following information sub-categories: Art History; Artists; Arts Therapy;
Awards; Booksellers; Censorship; Chats and Forums; Companies; Crafts; Criticism and
Theory; Cultural Policy; Cultures and Groups; Design Arts; Education; Employment; Events;
Humanities; Institutes; Museums, Galleries, and Centers; News and Media; Organizations;
Performing Arts; Reference; Thematic; Visual Arts; Web Directories. Notably, these
information subcategories would be defined as the subdimensions below the dimension Arts &
Humanities. As shown in Fig. 15B, each of the subdimensions defined above can be further
decomposed into additional information categories, as revealed at the Yahoo Website. Along
the lines described above, an MDB-based URL directory can be constructed in a
straightforward manner in accordance with the principles of the present invention.
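
How such a classification scheme might be turned into integer-encoded dimensions can be sketched as follows. The encoding by list position, the abbreviated category lists, and the function name are assumptions made for illustration only.

```python
# Abbreviated, illustrative subset of the categories named in the text.
DIRECTORY = {
    "Arts and Humanities": ["Art History", "Artists", "Museums, Galleries, and Centers"],
    "Business & Economy": [],
    "Science": [],
}


def build_index(directory):
    """Assign each category and subcategory an integer code by its position."""
    index = {}
    for d, (category, subcategories) in enumerate(directory.items()):
        index[(category,)] = (d,)
        for s, subcategory in enumerate(subcategories):
            index[(category, subcategory)] = (d, s)
    return index


INDEX = build_index(DIRECTORY)
print(INDEX[("Arts and Humanities", "Artists")])  # (0, 1)
```

Once every category path has an integer code of this kind, the codes can serve as the integer-encoded dimensions that the address data mapping scheme requires.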

As shown in Fig. 15A, the MDB-based URL directory system 20 comprises: an
Internet (i.e. http) information server (e.g. Origin 2000 Server from Silicon Graphics, Inc.) 21
connected to the infrastructure of the Internet, a back-end parallel computing system 22 for
supporting the MDB-based URL directory described above, and an OLAP server 23, operably
connected to Internet information server 21 by a high-speed information network 24. The
MDB-based URL directory is interfaced with the http information server 21 by way of a
common gateway interface (CGI), Java-scripts, or like processes well known or otherwise to
be developed in the art. Information contained in the MDB-based URL directory is accessible
by any web enabled client machine 25 operably connected to the infrastructure of the Internet
26, in a manner known in the art.

As shown in Fig. 15A, the MDB-based URL directory system 20 also includes an
information registration subsystem 27 comprising an Internet (i.e. http) information server 28,
connected to a relational database management system (RDBMS) 29 realized using a robust
database development program, such as Oracle 8i from the Oracle Corporation. The main
function of the information registration subsystem 27 is to enable owners or agents of
Internet-based information resources (e.g. HTML documents, XML documents, and the like) to register
such information resources with the MDB-based URL Directory in accordance with the current
information classification scheme being employed by the directory system 20. Anyone having
a Web (http) enabled client machine 25, equipped with an http browser, can register Internet
documents with the subsystem 27 in a quick and simple manner by accessing an
HTML-encoded form from the http server 28, completing the form and returning the same thereto for
analysis and updating the RDBMS. Alternatively, Web-enabled EDI processes can be used
between the client machines 25 and the http server 28, properly EDI-enabled, in order to
transfer URL information to the RDBMS-based URL Registration Data Warehouse 29. With
the above-described system arrangement, the RDBMS 29 can be continuously updated during
the course of the day, and then used to update the MDB-based URL directory at predetermined
times during the day and/or evening when peak demand for directory services is expected to be
reduced.

As shown in Fig. 16, the MDB-based URL directory of the system 20 is updated by
loading data elements from the RDBMS Data Warehouse 29 into the storage volumes 5 of the
MDB URL Directory using the novel modular-arithmetic based address data mapping scheme
of the present invention, described in detail hereinabove. Also, updating operations of the
MDB-based URL directory will typically require the shifting of data elements within the
MDB, using the address data mapping scheme as well, in order to reflect any changes made in
the information classification scheme since the last updating operation.
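
The update path from the registration RDBMS into the storage volumes can be sketched as follows. An in-memory SQLite database stands in for the RDBMS 29, and the table layout, the number of volumes, and the mapping (category_id + subcategory_id) mod n are all hypothetical choices for illustration.

```python
import sqlite3

NUM_VOLUMES = 4  # assumed number of storage volumes


def volume_of(category_id, subcategory_id, n=NUM_VOLUMES):
    """Assumed modular-arithmetic mapping from integer dimensions to a volume."""
    return (category_id + subcategory_id) % n


def load_registrations(conn, n=NUM_VOLUMES):
    """Read registered URLs from the RDBMS and place each in its mapped volume."""
    volumes = [[] for _ in range(n)]
    rows = conn.execute(
        "SELECT category_id, subcategory_id, url FROM registrations"
    )
    for category_id, subcategory_id, url in rows:
        volumes[volume_of(category_id, subcategory_id, n)].append(url)
    return volumes


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrations (category_id INTEGER, subcategory_id INTEGER, url TEXT)"
)
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [
        (0, 1, "http://example.com/a"),
        (2, 3, "http://example.com/b"),
        (1, 3, "http://example.com/c"),
    ],
)
volumes = load_registrations(conn)
print(volumes)
```

Because the target volume is computed from the integer codes alone, a change in the classification scheme changes the codes and therefore the computed volume, which is why an update may shift elements that are already stored.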

Personalized E-Commerce On-Line Shopping System of the Present Invention

The address data mapping method of the present invention can be used to provide an
improved system and method of generating personalized e-commerce-enabled (on-line)
shopping environments (i.e. personalized e-stores) using information accessed from an MDB
containing consumer shopping profile information. In such an application, shown in Figs. 17
and 18, there may or may not be any need for data aggregation as in the above-described
MOLAP application.

As shown in Fig. 17, the personalized on-line shopping system 30 of the present
invention comprises: a RDBMS-based consumer shopping profile Data Warehouse 31 for
storing consumer shopping profile information (e.g. representative of buying patterns, interests,
hobbies as a function of time, personal information, credit history, income, home and auto
ownership, marital status, etc.) collected from electronic commerce based transactions, compiled
databases, publicly-traded response databases and the like; a consumer shopping profile MDB
32, realized on a parallel computing platform similar to 2 in Fig. 6, for storing consumer
shopping profile data, loaded from the Data Warehouse 31; an Express™ OLAP server 33; and
one or more electronic commerce information servers 34 connected to the Data Warehouse 31
and the Express OLAP server 33, for supporting one or more personalized on-line shopping
WWW sites over the Internet, and transferring consumer transaction records to the Data
Warehouse 31 after each consumer transaction. The dimensions of the MDB 32 will be selected
on the basis of the consumer shopping profile attributes mined from RDBMS Data Warehouse
31.

As shown in Fig. 18, the first step of the personalized on-line shopping method hereof
involves collecting consumer shopping profile information (e.g. representative of buying
patterns, interests, hobbies, personal information, credit history, income, home and auto
ownership, marital status, etc.) from electronic commerce based transactions, compiled
databases, publicly-traded response databases and the like, and storing such information within
the RDBMS-based Data Warehouse 31, as shown in Fig. 17.

The second step of the personalized on-line shopping method involves loading raw
consumer shopping profile information from the Data Warehouse 31 to the MDB 32 using the
parallel computing platform and parallelized data loading processes of the present invention
described in great detail hereinabove.

The third step of the personalized on-line shopping method involves preaggregating (i.e.
precompiling) the consumer shopping profile information within the MDB 32 using the parallel
computing platform and parallelized data aggregation processes of the present invention,
described in great detail hereinabove.

The fourth step of the personalized on-line shopping method involves identifying,
during each on-line shopping transaction, the Web-enabled consumer engaged in on-line
shopping through a particular WWW site, using a Web-enabled client machine 35 equipped with
an http client (browser) program and connected to the infrastructure of the Internet 36.

The fifth step of the personalized on-line shopping method involves accessing from the
MDB 32, personal shopping information maintained on the identified consumer/shopper, and
using the same, in order to construct (in real-time) personalized (i.e. customized) Web-pages
that subject the consumer to a personalized shopping environment that reflects his or her
interests, tastes, desires, values and/or expectations.

The sixth step of the personalized on-line shopping method involves analyzing, at the
end of each such transaction, the set of data collected on the consumer from his or her
shopping and/or browsing activities, in order to mine for particular consumer shopping
attributes preclassified within the RDBMS Data Warehouse 31.

The seventh step of the personalized on-line shopping method involves storing such
analyzed data within the RDBMS Data Warehouse 31.

The eighth step of the personalized on-line shopping method involves using the
Express™ Server 33 to periodically upload the data from the continuously updated Data
Warehouse 31 into the MDB 32.

Thereafter, if necessary, the raw data loaded into the MDB 32 is pre-aggregated. The
information stored within the MDB subsystem 32 reflects current personal shopping profiles
of the consumers (e.g. consumer and consumer households alike) represented therewithin.
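
The flow of data through these steps can be sketched in miniature as follows. Plain Python containers stand in for the Data Warehouse 31 and the MDB 32, and the function names, the per-category spend totals, and the ranking rule used to personalize a page are all hypothetical simplifications.

```python
from collections import defaultdict

warehouse = []  # stands in for the RDBMS Data Warehouse 31
mdb = defaultdict(lambda: defaultdict(float))  # stands in for the MDB 32


def record_transaction(consumer_id, category, amount):
    """Steps six and seven: store an analyzed transaction record."""
    warehouse.append((consumer_id, category, amount))


def upload_and_preaggregate():
    """Step eight plus pre-aggregation: total spend per consumer and category."""
    mdb.clear()
    for consumer_id, category, amount in warehouse:
        mdb[consumer_id][category] += amount


def personalized_categories(consumer_id, top_n=2):
    """Step five: rank the identified consumer's categories by total spend."""
    profile = mdb.get(consumer_id, {})
    return sorted(profile, key=profile.get, reverse=True)[:top_n]


record_transaction("c1", "books", 30.0)
record_transaction("c1", "music", 12.5)
record_transaction("c1", "books", 20.0)
record_transaction("c1", "garden", 5.0)
upload_and_preaggregate()
print(personalized_categories("c1"))  # ['books', 'music']
```

The point of the split is that the page-building step reads only pre-aggregated totals, while new transactions accumulate in the warehouse until the next periodic upload.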

Other Applications of the Present Invention

It is contemplated that the address data mapping processes of the present invention can
be embodied within the MDB subsystem 32 used to manage multiple dimensions of
information for real-time control of packet routes, switches and other devices used within the
infrastructure of the Internet. The advantage of using the MDB subsystem 32 of the present
invention is that pre-aggregated information contained therein can be quickly accessed in
real-time to control events on the Internet in a real-time manner.

The address data mapping processes of the present invention can be embodied within an
MDB used to manage multiple dimensions of information for real-time control of automated
parcel (e.g. package) routing and sortation systems, so that packages automatically identified,
dimensioned and weighed while being transported along a conveyor belt can be routed to their
destinations along a least-cost shipping route based on a hierarchy of information dimensions
reflected within the MDB of the system.

The address data mapping processes of the present invention can be embodied within an
MDB subsystem used in a MOLAP environment for answering questions about corporate
performance in a particular market, economic trends, consumer behaviors, weather conditions,
population trends, or the state of any physical, social, biological or other system or
phenomenon on which different types or categories of information, organizable in accordance
with a predetermined dimensional hierarchy, are collected and stored within a RDBMS of one
sort or another. Regardless of the particular application selected, the address data mapping
processes of the present invention will provide a quick and efficient way of managing a MDB
and also enabling decision support capabilities utilizing the same in diverse application
environments.

It is understood that the System and Method of the illustrative embodiments
described hereinabove may be modified in a variety of ways which will become readily apparent
to those skilled in the art having the benefit of the novel teachings disclosed herein. All such
modifications and variations of the illustrative embodiments thereof shall be deemed to be
within the scope and spirit of the present invention as defined by the Claims to Invention
appended hereto.

CLAIMS TO INVENTION:

1. A system for accessing data elements within a multidimensional database
(MDB) comprising:

a parallel computing platform having a plurality of processors and one or more storage
volumes for physically storing data elements therein at integer-encoded physical addresses
specified in Processor Storage Space, and wherein the location of each data element in said
MDB is specified in MDB Space by integer-encoded business dimensions associated with said
data element;

an address data mapping mechanism, associated with said parallel computing platform,
for mapping the integer-encoded MDB dimensions associated with each said data element into
an integer-encoded data storage address within said storage volumes associated with the MDB;
and

a data accessing mechanism, in cooperation with said address mapping mechanism, for
accessing said data element in said one or more storage volumes using said integer-encoded data
storage address.

2. The system of claim 1, wherein said address data mapping mechanism comprises
means for mapping said integer-encoded MDB dimensions into said integer-encoded data
storage address using a modular arithmetic function.

3. The system of claim 2, wherein parallel data loading operations are carried out
between a relational database management system (RDBMS) and said MDB using said modular
arithmetic function which maps said integer-encoded MDB dimensions associated with each
raw data element accessed from said RDBMS, into an integer-encoded data storage address
within one of said storage volumes associated with said MDB.

4. The system of claim 2, wherein parallel data aggregation operations are carried
out within said MDB using said modular arithmetic function which maps said integer-encoded
MDB dimensions associated with raw or previously pre-aggregated data elements to be stored
within said MDB, into integer-encoded data storage addresses within said storage volumes at
which the pre-aggregated data elements are to be stored.

5. The system of claim 4, wherein OLAP operations are carried out within said MDB
using said modular arithmetic function which maps said integer-encoded MDB dimensions
associated with pre-aggregated data elements to be accessed from said MDB, into
integer-encoded data storage addresses within said storage volumes, from which said pre-aggregated
data elements are to be accessed.

6. The system of claim 1, wherein data processing tasks carried out therein are
evenly distributed among said plurality of processors on said parallel computing platform.

7. The system of claim 1, wherein data elements within said MDB are evenly
distributed among said plurality of processors on said parallel computing platform.

8. The system of claim 1, wherein each said processor on said parallel computing 
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data loading operations between a relational database 
management system (RDBMS) and said MDB within said system. 






9. The system of claim 1, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data aggregation operations within said MDB of said
system.

10. The system of claim 1, wherein interprocessor communication among said
plurality of processors is minimized during parallel data loading operations carried out between
a relational database management system (RDBMS) and said MDB on said parallel computing
platform.




11. The system of claim 1, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data aggregation operations carried out 
within said MDB on said parallel computing platform. 

12. The system of claim 1, wherein interprocessor communication among said 
plurality of processors is minimized during OLAP operations carried out within said MDB on 
said parallel computing platform. 



13. A method of accessing data elements within a multidimensional database
(MDB) comprising:

(a) providing a parallel computing platform having a plurality of processors and one or
more storage volumes for physically storing data elements therein at integer-encoded physical
addresses specified in Processor Storage Space, and wherein the location of each data element in
said MDB is specified in MDB Space by integer-encoded business dimensions associated with
said data element;

(b) mapping the integer-encoded MDB dimensions associated with each said data
element into an integer-encoded data storage address within said storage volumes associated
with the MDB; and

(c) using said integer-encoded data storage address to access said data element from said
one or more storage volumes.

14. The method of claim 12, wherein step (b) comprises mapping said
integer-encoded MDB dimensions into said integer-encoded data storage address using a modular
arithmetic function.

15. The method of claim 12, wherein parallel data loading operations are carried out
between a relational database management system (RDBMS) and said MDB and step (b)
comprises using a modular-arithmetic function to map said integer-encoded MDB dimensions
associated with each raw data element accessed from said RDBMS, into an integer-encoded data
storage address within one of said storage volumes associated with said MDB.

16. The method of claim 12, wherein parallel data aggregation operations are carried
out within said MDB and step (b) comprises using said modular arithmetic function to map
said integer-encoded MDB dimensions associated with raw or previously pre-aggregated data
elements to be stored within said MDB, into integer-encoded data storage addresses within said
storage volumes at which the pre-aggregated data elements are to be stored.

17. The method of claim 15, wherein OLAP operations are carried out within said
MDB and step (b) comprises using said modular arithmetic function to map said
integer-encoded MDB dimensions associated with pre-aggregated data elements to be accessed from
said MDB, into integer-encoded data storage addresses within said storage volumes, from which
said pre-aggregated data elements are to be accessed.






18. The method of claim 12, wherein data processing tasks carried out therein are
evenly distributed among said plurality of processors on said parallel computing platform.

19. The method of claim 12, wherein data elements within said MDB are evenly
distributed among said plurality of processors on said parallel computing platform.



20. The method of claim 12, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data loading operations between a relational database
management system (RDBMS) and said MDB within said system.

21. The method of claim 12, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data aggregation operations within said MDB of said
system.

22. The method of claim 12, wherein interprocessor communication among said
plurality of processors is minimized during parallel data loading operations carried out between
a relational database management system (RDBMS) and said MDB on said parallel computing
platform.

23. The method of claim 12, wherein interprocessor communication among said
plurality of processors is minimized during parallel data aggregation operations carried out
within said MDB on said parallel computing platform.

24. The method of claim 12, wherein interprocessor communication among said
plurality of processors is minimized during OLAP operations carried out within said MDB on
said parallel computing platform.

25. A system for managing data elements within a multidimensional database
(MDB) comprising:

a parallel computing platform having a plurality of processors and one or more storage
volumes for physically storing data elements therein at integer-encoded physical addresses
specified in Processor Storage Space, and wherein the location of each data element in said
MDB is specified in MDB Space by integer-encoded business dimensions associated with said
data element;

an address data mapping mechanism, in association with said parallel computing
platform, for mapping the integer-encoded MDB dimensions associated with each said data
element into an integer-encoded data storage address within said storage volumes associated
with the MDB; and

a data management mechanism, in cooperation with said address data mapping
mechanism, for managing said data element in said one or more storage volumes using said
integer-encoded data storage address.






26. The system of claim 25, wherein said address data mapping mechanism
comprises means for mapping said integer-encoded MDB dimensions into said integer-encoded
data storage address using a modular arithmetic function.

27. The system of claim 25, wherein parallel data loading operations are carried out
between a relational database management system (RDBMS) and said MDB using said modular
arithmetic function which maps said integer-encoded MDB dimensions associated with each
raw data element accessed from said RDBMS, into an integer-encoded data storage address
within one of said storage volumes associated with said MDB.

28. The system of claim 25, wherein parallel data aggregation operations are carried
out within said MDB using said modular arithmetic function which maps said integer-encoded
MDB dimensions associated with raw or previously pre-aggregated data elements to be stored
within said MDB, into integer-encoded data storage addresses within said storage volumes at
which the pre-aggregated data elements are to be stored.

29. The system of claim 25, wherein OLAP operations are carried out within said
MDB using said modular arithmetic function which maps said integer-encoded MDB
dimensions associated with pre-aggregated data elements to be accessed from said MDB, into
integer-encoded data storage addresses within said storage volumes, from which said
pre-aggregated data elements are to be accessed.

30. The system of claim 25, wherein data processing tasks carried out therein are
evenly distributed among said plurality of processors on said parallel computing platform.

31. The system of claim 25, wherein data elements within said MDB are evenly
distributed among said plurality of processors on said parallel computing platform.

32. The system of claim 25, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data loading operations between a relational database
management system (RDBMS) and said MDB within said system.

33. The system of claim 25, wherein each said processor on said parallel computing 
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data aggregation operations within said MDB of said 
system. 

34. The system of claim 25, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data loading operations carried out between 
a relational database management system (RDBMS) and said MDB on said parallel computing 
platform. 

35. The system of claim 25, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data aggregation operations carried out 
within said MDB on said parallel computing platform. 

36. The system of claim 25, wherein interprocessor communication among said 
plurality of processors is minimized during OLAP operations carried out within said MDB on 
said parallel computing platform. 

37. A method of managing data elements within a multidimensional database
(MDB) comprising:

(a) providing a parallel computing platform having a plurality of processors and one or
more storage volumes for physically storing data elements therein at integer-encoded physical
addresses specified in Processor Storage Space, and wherein the location of each data element in
said MDB is specified in MDB Space by integer-encoded business dimensions associated with
said data element;

(b) mapping the integer-encoded MDB dimensions associated with each said data
element into an integer-encoded data storage address within said storage volumes associated
with the MDB; and

(c) using said integer-encoded data storage address to manage said data element in said
one or more storage volumes.

38. The method of claim 36, wherein step (b) comprises mapping said
integer-encoded MDB dimensions into said integer-encoded data storage address using a modular
arithmetic function.

39. The method of claim 36, wherein parallel data loading operations are carried out 
between a relational database management system (RDBMS) and said MDB and step (b) 
comprises using a modular-arithmetic function to map said integer-encoded MDB dimensions 
associated with each raw data element accessed from said RDBMS, into an integer-encoded data 
storage address within one of said storage volumes associated with said MDB. 



40. The method of claim 36, wherein parallel data aggregation operations are carried
out within said MDB and step (b) comprises using said modular arithmetic function to map
said integer-encoded MDB dimensions associated with raw or previously pre-aggregated data
elements to be stored within said MDB, into integer-encoded data storage addresses within said
storage volumes at which the pre-aggregated data elements are to be stored.

41. The method of claim 36, wherein OLAP operations are carried out within said
MDB and step (b) comprises using said modular arithmetic function to map said
integer-encoded MDB dimensions associated with pre-aggregated data elements to be accessed from
said MDB, into integer-encoded data storage addresses within said storage volumes, from which
said pre-aggregated data elements are to be accessed.

42. The method of claim 36, wherein data processing tasks carried out on said
parallel computing platform are evenly distributed among said plurality of processors on said 
parallel computing platform. 

43. The method of claim 36, wherein data elements within said MDB are evenly
distributed among said plurality of processors on said parallel computing platform.
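
To illustrate how a modulo-based mapping can produce the even distribution recited above, the following sketch counts how many cells of a small hypothetical MDB land on each processor when the processor is taken to be the row-major cell index modulo the processor count. The scheme, the function names, and the dimension sizes are assumptions made for illustration and are not specified in the claims.

```python
from collections import Counter
from itertools import product


def processor_for(coords, sizes, num_processors):
    """Assumed scheme: row-major cell index modulo the processor count."""
    index = 0
    for c, n in zip(coords, sizes):
        index = index * n + c
    return index % num_processors


def distribution(sizes, num_processors):
    """Count the cells assigned to each processor over the whole MDB."""
    counts = Counter()
    for coords in product(*(range(n) for n in sizes)):
        counts[processor_for(coords, sizes, num_processors)] += 1
    return dict(sorted(counts.items()))


# Hypothetical MDB with 4 x 3 x 5 = 60 cells spread over 4 processors.
print(distribution((4, 3, 5), 4))  # {0: 15, 1: 15, 2: 15, 3: 15}
```

Because consecutive cell indices cycle through the processors, the per-processor counts in this sketch differ by at most one cell, even when the cell count is not a multiple of the processor count.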

44. The method of claim 36, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data loading operations between a relational database
management system (RDBMS) and said MDB within said system.

45. The method of claim 36, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment
operations carried out during parallel data aggregation operations within said MDB of said
system.

46. The method of claim 36, wherein interprocessor communication among said
plurality of processors is minimized during parallel data loading operations carried out between
a relational database management system (RDBMS) and said MDB on said parallel computing
platform.

47. The method of claim 36, wherein interprocessor communication among said
plurality of processors is minimized during parallel data aggregation operations carried out
within said MDB on said parallel computing platform.

48. The method of claim 36, wherein interprocessor communication among said 
plurality of processors is minimized during OLAP operations carried out within said MDB on 
said parallel computing platform.

49. An Internet URL Directory system for supporting on-line information searching
operations by Web-enabled client machines, said Internet URL Directory system comprising: 

a parallel computing platform having a plurality of processors and one or more storage 
volumes for physically storing a plurality of data elements of a multidimensional database 
(MDB) in said one or more storage volumes at integer-encoded physical addresses specified in 
Processor Storage Space, and wherein the location of each data element in said MDB is 
specified in MDB Space by integer-encoded business dimensions associated with said data 
element; 

an address data mapping mechanism, associated with said parallel computing platform,
for mapping the integer-encoded MDB dimensions associated with each said data element into 
an integer-encoded data storage address within said storage volumes associated with the MDB; 
and 

a data accessing mechanism, in cooperation with said address mapping mechanism, for 
accessing said data element in said one or more storage volumes using said integer-encoded data 
storage address. 

50. An Internet-enabled system for supporting real-time control of processes in 
response to complex states of information reflected in a multi-dimensional database (MDB), 
said Internet-enabled system comprising: 

a parallel computing platform having a plurality of processors and one or more storage 
volumes for physically storing a plurality of data elements of a multidimensional database 
(MDB) in said one or more storage volumes at integer-encoded physical addresses specified in 
Processor Storage Space, and wherein the location of each data element in said MDB is 
specified in MDB Space by integer-encoded business dimensions associated with said data
element; 

an address data mapping mechanism, associated with said parallel computing platform,
for mapping the integer-encoded MDB dimensions associated with each said data element into 
an integer-encoded data storage address within said storage volumes associated with the MDB; 
and

a data accessing mechanism, in cooperation with said address mapping mechanism, for
accessing said data element in said one or more storage volumes using said integer-encoded data 
storage address. 

51. A system for accessing data elements within a multidimensional database
(MDB) comprising:

a parallel computing platform having a plurality of processors and one or more storage 
volumes for physically storing data elements therein at physical storage addresses specified in
Processor Storage Space, and wherein the location of each data element in said MDB is
specified in MDB Space by business dimensions associated with said data element;

an address data mapping mechanism, associated with said parallel computing platform,
for mapping the physical MDB dimensions associated with each said data element into a
physical data storage address within said storage volumes associated with the MDB; and 

a data accessing mechanism, in cooperation with said address mapping mechanism, for 
accessing said data element in said one or more storage volumes using said physical storage address.

52. A system for accessing data elements within a multidimensional database 
(MDB) comprising: 

a parallel computing platform having a plurality of processors and associated memory,
wherein each processor is assigned a unique integer-encoded data storage address space in said
memory for storing data elements therein, and wherein location of each data element in said 
MDB is specified in MDB Space by integer-encoded business dimensions associated with said 
data element; 

an address data mapping mechanism, associated with said parallel computing platform,
for mapping the integer-encoded MDB dimensions associated with each said data element into
integer-encoded data storage addresses within said memory; and 

a data accessing mechanism, in cooperation with said address mapping mechanism, for 
accessing a given data element in said memory using the integer-encoded data storage address
associated with the given data element.

53. The system of claim 52, wherein said address data mapping mechanism
comprises means for mapping said integer-encoded MDB dimensions into said integer-encoded 
data storage addresses using a modular arithmetic function.

54. The system of claim 53, wherein parallel data loading operations are carried out 
between a relational database management system (RDBMS) and said MDB using said modular 
arithmetic function which maps said integer-encoded MDB dimensions associated with each
raw data element accessed from said RDBMS, into an integer-encoded data storage address
within said memory.

55. The system of claim 53, wherein parallel data aggregation operations are carried 
out within said MDB using said modular arithmetic function which maps said integer-encoded
MDB dimensions associated with raw or previously pre-aggregated data elements to be stored
within said MDB, into integer-encoded data storage addresses within said memory at which the
pre-aggregated data elements are to be stored.

56. The system of claim 55, wherein OLAP operations are carried out within said
MDB using said modular arithmetic function which maps said integer-encoded MDB
dimensions associated with pre-aggregated data elements to be accessed from said MDB, into
integer-encoded data storage addresses within said memory, from which said pre-aggregated data
elements are to be accessed.

57. The system of claim 52, wherein data processing tasks carried out therein are 
evenly distributed among said plurality of processors on said parallel computing platform.

58. The system of claim 52, wherein said mapping performed by said address data 
mapping mechanism evenly distributes data elements among said integer-encoded address space 
in said memory associated with said plurality of processors of said parallel computing platform.

59. The system of claim 52, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data loading operations between a relational database 
management system (RDBMS) and said MDB within said system. 

60. The system of claim 52, wherein each said processor on said parallel computing
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data aggregation operations within said MDB of said 
system. 

61. The system of claim 52, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data loading operations carried out between 
a relational database management system (RDBMS) and said MDB on said parallel computing 
platform. 

62. The system of claim 52, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data aggregation operations carried out 
within said MDB on said parallel computing platform. 

63. The system of claim 52, wherein interprocessor communication among said 
plurality of processors is minimized during OLAP operations carried out within said MDB on 
said parallel computing platform. 

64. The system of claim 52, wherein said integer-encoded address space in said 
memory comprises a virtual address space, and said system further comprises a virtual memory 
management system for mapping said virtual address space to real address space accessible on a 
storage device. 

65. A method for accessing data elements within a multidimensional database 
(MDB) comprising the steps of:

providing a parallel computing platform having a plurality of processors and associated
memory, wherein each processor is assigned a unique integer-encoded data storage address 
space in said memory for storing data elements therein, and wherein location of each data 
element in said MDB is specified in MDB Space by integer-encoded business dimensions 
associated with said data element; 

mapping the integer-encoded MDB dimensions associated with each said data element 
into integer-encoded data storage addresses within said memory; and 

accessing a given data element in said memory using the integer-encoded data storage 
address associated with the given data element.

66. The method of claim 65, wherein said mapping step uses a modular arithmetic
function. 

67. The method of claim 66, further comprising the step of:

performing parallel data loading operations between a relational database management 
system (RDBMS) and said MDB using said modular arithmetic function which maps said
integer-encoded MDB dimensions associated with each raw data element accessed from said 
RDBMS, into an integer-encoded data storage address within said memory. 

68. The method of claim 66, further comprising the step of: 

performing parallel data aggregation operations within said MDB using said modular 
arithmetic function which maps said integer-encoded MDB dimensions associated with raw or
previously pre-aggregated data elements to be stored within said MDB, into integer-encoded 
data storage addresses within said memory at which the pre-aggregated data elements are to be 
stored. 

69. The method of claim 68, further comprising the steps of: 

performing OLAP operations within said MDB using said modular arithmetic function
which maps said integer-encoded MDB dimensions associated with pre-aggregated data
elements to be accessed from said MDB, into integer-encoded data storage addresses within said
memory, from which said pre-aggregated data elements are to be accessed.

70. The method of claim 65, wherein data processing tasks carried out on said MDB are 
evenly distributed among said plurality of processors on said parallel computing platform.

71. The method of claim 65, wherein said mapping performed by said address data
mapping mechanism evenly distributes data elements among said integer-encoded address space 
in said memory associated with said plurality of processors of said parallel computing platform. 

72. The method of claim 65, wherein each said processor on said parallel computing 
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data loading operations between a relational database 
management system (RDBMS) and said MDB within said system. 

73. The method of claim 65, wherein each said processor on said parallel computing 
platform handles data elements assigned thereto during data elements address assignment 
operations carried out during parallel data aggregation operations within said MDB of said 
system. 

74. The method of claim 65, wherein interprocessor communication among said
plurality of processors is minimized during parallel data loading operations carried out between 
a relational database management system (RDBMS) and said MDB on said parallel computing 
platform. 

75. The method of claim 65, wherein interprocessor communication among said 
plurality of processors is minimized during parallel data aggregation operations carried out 
within said MDB on said parallel computing platform.

76. The method of claim 65, wherein interprocessor communication among said
plurality of processors is minimized during OLAP operations carried out within said MDB on
said parallel computing platform.

77. The method of claim 65, wherein said integer-encoded address space in said 
memory comprises a virtual address space, further comprising the step of mapping said virtual 
address space to real address space accessible on a storage device.

Array structure of a multidimensional variable (PRIOR ART)

Page Allocation Table pointing to physical records of a multidimensional variable (e.g. the first two rows of the variable of FIG. 2B reside in page #0). Columns: Page #; Page of physical records.

Fig. 2D (PRIOR ART)